A database is a way to collect information about a business in an organized manner so that it is available to the person who needs it, when it is needed. A database is a collection of schemas, tables, queries, reports, views, and other elements. Data in the database can be updated, and newly acquired data can be added to it.
Databases typically organize data to model aspects of reality in a way that supports processes requiring information, such as checking whether the particular laptop a customer wants is available, or finding a hotel with vacant rooms. A database generally contains an accumulation of files or records, such as inventory details and sales and dispatch transactions. There are several types of database, described below.
Types of database
Relational databases are generally made up of a set of tables, each holding data that fits a predefined category. Each table has at least one data category in a column, and each row holds a data instance for the categories defined by the columns, so the table forms a matrix of data against categories.
The standard user and application program interface for a relational database is Structured Query Language (SQL). Relational databases are easily extended: a new data category can be added after the original database is created without requiring much modification.
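As a minimal sketch of the two points above (tables as rows and columns queried through SQL, and a category added after creation), the following uses Python's built-in sqlite3 module. The table name, column names, and sample rows are hypothetical and chosen only for illustration.

```python
import sqlite3

# In-memory database so the sketch leaves no file behind.
conn = sqlite3.connect(":memory:")

# Hypothetical table: each column is a data category, each row one data instance.
conn.execute(
    "CREATE TABLE laptops (id INTEGER PRIMARY KEY, model TEXT, in_stock INTEGER)"
)
conn.executemany(
    "INSERT INTO laptops (model, in_stock) VALUES (?, ?)",
    [("Alpha 13", 4), ("Beta 15", 0)],
)

# Extend the schema after creation: add a new category without rebuilding the table.
conn.execute("ALTER TABLE laptops ADD COLUMN price REAL")
conn.execute("UPDATE laptops SET price = ? WHERE model = ?", (899.0, "Alpha 13"))

# Query with SQL: which models are currently available?
available = conn.execute(
    "SELECT model, price FROM laptops WHERE in_stock > 0 ORDER BY model"
).fetchall()
print(available)
```

Rows that existed before the new column was added simply hold no value for it until one is supplied, which is why the extension needs so little modification.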
A distributed database is a database in which some parts of the database are stored in multiple physical locations and in which processing is dispersed or replicated among different points in a network.
Distributed databases fall into two categories: homogeneous and heterogeneous. All the physical locations in a homogeneous distributed database system have the same underlying hardware and run the same operating systems and database applications. In a heterogeneous distributed database, the hardware, operating systems, or database applications may be different at each location.
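The distinction can be expressed as a simple check. The sketch below is illustrative only: the Site class, its fields, and the sample values are assumptions made for the example, not part of any real database product.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Site:
    """One physical location of a distributed database (hypothetical model)."""
    name: str
    hardware: str
    operating_system: str
    database_application: str


def is_homogeneous(sites):
    """Return True when every site shares the same hardware, OS, and database application."""
    stacks = {
        (site.hardware, site.operating_system, site.database_application)
        for site in sites
    }
    return len(stacks) <= 1


# Made-up sample sites: the first two match, the third runs a different stack.
london = Site("london", "x86-64", "Linux", "db-app-A")
tokyo = Site("tokyo", "x86-64", "Linux", "db-app-A")
sydney = Site("sydney", "ARM64", "Linux", "db-app-B")

print(is_homogeneous([london, tokyo]))
print(is_homogeneous([london, tokyo, sydney]))
```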
A cloud database is built for a virtual environment; it can run in a hybrid cloud, public cloud, or private cloud. Cloud databases provide benefits such as the ability to pay for storage capacity and bandwidth on a per-use basis, and they provide scalability on demand along with high availability.
A cloud database also gives enterprises the opportunity to support business applications in a software-as-a-service deployment.
NoSQL databases are suited to large sets of distributed data. They are effective for big data performance issues that relational databases aren’t built to solve, and are most effective when an organization analyzes large chunks of unstructured data or data that is stored on multiple virtual servers in the cloud.
An object-oriented database is organized around objects rather than actions, and around data rather than logic. For example, a multimedia record in a relational database can be a definable data object, as opposed to an alphanumeric value.
A graph database uses graph theory to store, map, and query data. Graph databases are collections of nodes and edges, where each node represents an entity and each edge represents a connection between nodes.
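The node-and-edge model can be sketched in a few lines of plain Python. This is an in-memory illustration under simplifying assumptions (undirected edges, no properties on nodes or edges), not how any particular graph database is implemented; the Graph class and the sample entities are hypothetical.

```python
from collections import defaultdict, deque


class Graph:
    """Undirected graph: nodes are entities, edges are connections between them."""

    def __init__(self):
        self._adjacent = defaultdict(set)

    def add_edge(self, a, b):
        """Connect two nodes; both are created if they do not exist yet."""
        self._adjacent[a].add(b)
        self._adjacent[b].add(a)

    def neighbors(self, node):
        """Return the nodes directly connected to the given node, sorted."""
        return sorted(self._adjacent.get(node, ()))

    def is_connected(self, start, goal):
        """Breadth-first search: can goal be reached from start by following edges?"""
        seen, queue = {start}, deque([start])
        while queue:
            current = queue.popleft()
            if current == goal:
                return True
            for nxt in self._adjacent.get(current, ()):
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append(nxt)
        return False


graph = Graph()
graph.add_edge("alice", "laptop")   # alice bought a laptop
graph.add_edge("bob", "laptop")     # bob bought the same laptop
graph.add_edge("carol", "hotel")    # carol booked a hotel

print(graph.neighbors("laptop"))             # ['alice', 'bob']
print(graph.is_connected("alice", "bob"))    # True
print(graph.is_connected("alice", "carol"))  # False
```

Queries here are traversals along edges, such as finding everyone linked to the same product, rather than lookups across separate tables.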
The database is generally arranged with the help of a database management system (DBMS). A DBMS is a computer-software application that interacts with end-users, other applications, and the database itself to capture and analyze data. A general-purpose DBMS allows the definition, creation, querying, update, and administration of databases.
Data warehousing is the process of constructing and using a data warehouse. A data warehouse is constructed by integrating data from multiple heterogeneous sources, and it supports analytical reporting, structured and/or ad-hoc queries, and decision making. Data warehousing involves data cleaning, data integration, and data consolidation.
The main functions of data warehousing are:
Data extraction, i.e. gathering or collecting data.
Data cleaning, i.e. finding and correcting errors in data.
Data transformation, i.e. converting the data from legacy format to warehouse format.
Data loading, i.e. sorting, summarizing, consolidating, checking integrity, and building indices and partitions.
Refreshing, i.e. updating from data sources to the warehouse.
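The functions listed above can be sketched as a small pipeline. This is a toy illustration only: the record layout, the assumption that the legacy format stores amounts as text, and all function names are hypothetical, and a real warehouse would also handle integrity checks, indices, and partitions.

```python
from collections import defaultdict


def extract(sources):
    """Gather raw records from every source into one list."""
    return [record for source in sources for record in source]


def clean(records):
    """Drop records with a missing product or amount and trim stray whitespace."""
    cleaned = []
    for record in records:
        product = (record.get("product") or "").strip()
        amount = (record.get("amount") or "").strip()
        if product and amount:
            cleaned.append({"product": product, "amount": amount})
    return cleaned


def transform(records):
    """Convert the assumed legacy format (amounts as text) to numeric values."""
    return [
        {"product": r["product"].lower(), "amount": float(r["amount"])}
        for r in records
    ]


def load(records):
    """Summarize amounts per product; the resulting dict acts as a simple index."""
    warehouse = defaultdict(float)
    for r in records:
        warehouse[r["product"]] += r["amount"]
    return dict(warehouse)


def refresh(sources):
    """Rebuild the warehouse from the current state of the sources."""
    return load(transform(clean(extract(sources))))


# Two made-up sources with inconsistent formatting and one unusable record.
sales = [{"product": " Laptop ", "amount": "1200.50"}, {"product": "Mouse", "amount": None}]
dispatch = [{"product": "laptop", "amount": "300"}]

warehouse = refresh([sales, dispatch])
print(warehouse)  # {'laptop': 1500.5}
```

Calling refresh again after the sources change repeats the whole pipeline, which is the simplest form of refreshing; incremental updates are a common refinement not shown here.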