Data Base Schema and Architecture

DATA
A representation of facts, concepts or instructions in a formalized manner suitable for communication, interpretation or processing by human being or by automatic means. Data consists of symbols written or stored on some recording medium.

DATA BASE SCHEME (SCHEMA)
Schema is a definition of some thing. A database scheme means the over all design of the data base. The database scheme include :
1. Characteristics of data objects such as entities and attributes.
2. Logical structure and relationships among these data objects.
3. Validation criteria and semantic constraints.
4. Physical storage representation (format).
5. Physical location on storage devices and media.
6. Integrity parameters such as access authorizations and back up policies.
The database scheme or schema contains all necessary and sufficient information for the system to do all the data base processing. Data base systems have several schemes partitioned according to the levels of abstraction - physical scheme (physical schema), conceptual scheme (conceptual schema), and subscheme (subschema). Database system supports one physical scheme, one conceptual scheme and several subschemes. The collection of information stored in the database at a particular moment in time is called an instance of the database.

DATA INDEPENDENCE
The immunity of applications to, changes in storage structure and access strategy
OR
The ability to modify the scheme definition in one level, without affecting the scheme definition in the next higher level.
OR
The ability to use the database without knowing the representation details is called data independence.
There are two levels of data independence physical data independence and logical data independence.
PHYSICAL DATA INDEPENDENCE
Physical data independence is the ability to modify the physical scheme without causing the application programs to be rewritten or without requiring the alteration of the conceptual scheme or redefinition of the subscheme. Modifications at a physical level are occasionally necessary in order to improve performance. For example, in ORACLE we can add new datafiles for storing the database with out modifying the database scheme.
LOGICAL DATA INDEPENDENE
The ability to modify the conceptual scheme without causing the application program to be rewritten or the redefinition of the subschemes. Modifications at the conceptual level are necessary whenever the logical structure of the database changes. For example : Addition of information about different types of new entities or extra information about existing entity. Many modifications to the conceptual scheme can be made without affecting existing subschemes. Some other modifications to the conceptual scheme can be made if we redefine the mapping from the subschema to the conceptual schema. In ORACLE, we can add columns to an existing table, with out modifying any view defined on that base table or any application previously designed on this table.
If we delete some information that corresponds to information present in the subscheme then the redefinition of mapping will not do any good. Such changes will require rewriting or discarding some application program.
Logical data independence is difficult to achieve than physical data independence since application programs are heavily dependent on the logical structure of the data they access.
DATA DEFINITION LANGUAGE
DDL is a language used for specifying the database scheme. The result of compilation of the DDL statements is a set of tables which are stored in a special file called data dictionary or directory. DDL provide facilities for the definition or the description of the database object. The DDL is not a procedural language; but it consists of statements for describing the types of entities and the relationships among entities in terms of a data model.
DDL can be used for designing or modifying the database, but it cannot be used for manipulation. There are three types of DDL - external DDL, conceptual DDL and internal DDL.

DATA STORAGE AND DEFINITION LANGUAGE
The storage structure and access methods are specified by a set of definitions in a special type of DDL called Data Storage and Definition Language (DSDL). The result of compilation of these definitions is a set of instruction to specify the implementation details of the database schemes which are hidden from the users.
DATA DICTIONARY
A data dictionary is one of the most important DBA tools. A data dictionary is a file that contains metadata or data about data. This file is consulted before the actual data is read or modified in the database system. The data dictionary is itself a data base. It contains descriptions of other objects in the system, rather than raw facts. All the various schemes are physically stored in both source and object form in the dictionary. A comprehensive dictionary will include cross reference information showing which program use which pieces of data, which department require which reports and so on. It should be possible to query the dictionary just like any other database so that the DBA can easily discover which programs are likely to be affected by some proposed changes to the system.
DATA MANUPULATION LANGUAGE
The language that enables the users to access or manipulate data as organized by the appropriate data model. There are two types of DML.
Procedural DML’s that require a user to specify what data is needed and how to get it.
Non Procedural DML’s that require a user to specify what data is needed without specifying how to get it.
DATA MANIPULATION
Data manipulation means retrieval of information stored in the database, insertion of new information into the database, deletion or modification of data stored in the database.
ARCHITECTURE OF A DATABASE SYSTEM
The architecture is divided into three general levels Internal, Conceptual and External. (Internal or Physical) (External or View). The architecture is designed for providing data abstraction, i.e., to hide the complexity from the users.
The internal level is the one close to the physical storage. This layer is concerned with the way in which the data is actually stored. The Internal view is a low-level representation of the entire data base. It consists of multiple occurrences of multiple types of internal records. It does not deal with physical records or blocks or any device specific constraints such as track size or cylinder size. The Internal view assumes an Infinite linear address space. The Internal view is described by means of the Internal Schema. It specifies the various types of stored records, the different indices, the representation of stored fields, the Physical sequence of records etc. The internal schema is written using the internal DDL. The external level is close to the user. It is concerned with the way in which the data is viewed by individual users. An external view is the content of the database as it is seen by some particular user. (i.e., to that user the external view is the database) For example, a user from the personnel department may consider the database as a collection of employee records and department records. An external view consists of multiple occurrences of multiple types of external records. Each external view is defined by means of an external schema. The external schema is defined by using the external DDL. The conceptual level is a level of indirection between the other two levels. This level will provide a community user view. The conceptual view is a representation of the entire information content of the data base. The conceptual view consists of multiple occurrences of multiple types of conceptual records. A conceptual record need not be the same as an external record or a stored record. The conceptual view is defined by means of the conceptual schema. The conceptual schema is written using conceptual DDL. If data independence is to be achieved the definition must not give any consideration for storage structure or access strategy. They must be definitions of the information contents only.
There exists two levels of mapping one between the external and the conceptual level and the other between the conceptual and the internal level - the conceptual/internal mapping defines the relationships between the conceptual view and the stored data base. It specifies how the conceptual record and field map into their stored counterparts. If the structure of the database is changed (if the change is made to the storage structure definition), the conceptual/internal mapping must be changed accordingly so that the conceptual schema remains invariant. The effects of this changes must be contained below this conceptual level so that data independence can be achieved. An external mapping defines the correspondence between a particular external view and the conceptual view. In an external view, fields may have different data types and records may be differently sequenced. Any number of external views may exist at the same time and any number of users may share the given external view.
DATA SUBLANGUAGE
Each user may use some language to interact with the database (Host language). For the application programmer, it may be conventional programming language like COBOL, PASCAL, ADA or PL-1 For the end user, it will be either a query language or a special purpose language designed for that user’s requirements and supported by an application program. User’s language will include a data sublanguage. (A subset, of the total language, that is concerned with the data base objects and operations.) The data sublanguage may be embedded in a host language. A given system may support multiple host languages and multiple data sublanguages (SQL is the only data sublanguage for Oracle). Any given data sublanguage has two components data definition language and data manipulation language.
{Consider a COBOL user, the data sublanguage of that user consists of those features of COBOL that are used to communicate with the data base. The DDL portion consists of those declarative constructs (data division) that are needed to declare the database objects. The DML portion consists of those executable statements of COBOL, that transfer information to and from the data base.}

DBMS : DBMS is the software that handles all access to the data base. Conceptually what happens is the following:
1) A user issues an access request, using some particular data manipulation language.
2) The DBMS intercepts the requests and interprets it.
3) The DBMS inspects the external schema, the external conceptual mapping, the conceptual schema, the conceptual Internal mapping and the storage structure definition.
4) The DBMS performs the necessary operations on the stored data base.
For the retrieval of a particular external record occurrence, fields will be required from several conceptual record occurrences. Each conceptual record occurrence may require several stored record occurrences. Then the DBMS must retrieve all the required stored record occurrences, construct the required conceptual occurrences and then construct the required external record occurrences. At each stage data type or other conversion may be necessary.
DATA DICTIONARY (Add the description here)
USER INTERFACE It is defined as the boundary in the system below which everything is invisible to the user. By definition, the user interface is at external level.
DBA The person or the group of persons who have overall control over the database.
The DBA’s responsibilities include:
1) DECIDING THE INFORMATION CONTENT OF THE DATA BASE: The DBA has to decide what information is to be held in the data base, i.e., to identify the entities of interest to the enterprise and to identify the information to be recorded about the entities. Having done this, the DBA then define the content of the data base by the conceptual schema using the conceptual DDL. The object form (compiled form) of the schema is used by the DBMS in responding to access request. The source form acts as a reference document for the users of the system.
2) DECIDING THE STORAGE STRUCTURE AND THE ACCESS STRATERGY: The DBA must also decide how data is represented in the data base and specify the representation by writing the storage structure definition using internal DDL. In addition the mapping between the conceptual schema and the storage structure definition must also be specified. The internal schema and the corresponding mapping will exist in source and object form.
3) LIASING WITH THE USERS: The DBA has to liase with the users to ensure that the data they require is available and to write the necessary external schemes by using external DDLs. In addition, the mapping between any given external schema and the conceptual schema must also be specified. Both the external schema and the mapping will exist in the data base in source and the object form.
4) DEFINING AUTHORISATION CHECKS AND VALIDATION PROCEDURES: Authorization checks (user creations) and validation procedures (constraints) may be considered as the logical extension of the conceptual schema. The conceptual DDL will include facilities for specifying such checks and procedures.
5) DEFINING A STRATERGY FOR BACKUP AND RECOVERY: Once an enterprise is committed to a data base system, it becomes critically dependent on the successful operation of that system. In the event of damage to any portion of the data base caused by human error, or a failure of the hardware or supporting OS - it is essential to be able to repair the data concerned with a minimum delay and with as little effect as possible on the rest of the system. For example the available data that has not been damaged should not be affected in any way. The DBA must define and implement an appropriate recovery strategy; which may involve a periodic dumping of the data base to a back up tape and procedures for reloading the relevant portion of the data base from the latest tapes.
6) MONITORING PERFORMANCE AND RESPONDING TO CHANGES IN REQUIREMENTS: The DBA is responsible for arranging the system so as to get the performance that is best for the enterprise and for making appropriate adjustments as requirements changes. Any change to the details of storage and access must be accompanied by a corresponding change to the definition of mapping to the storage so that the conceptual schema may remain constant.
The DBA will require a number of utility programs to do this task:
i) Loading routines (to create the initial version of the data base)
ii) Reorganizing routines (to rearrange the data base to reclaim the space occupied by obsolete data)
iii) Journalizing routine (to note each operation against the database)
iv) Recovery routines (to restore the data base to an earliest state after a failure)
v) Statistical analysis routine (to assist in monitoring performance)
Utility programs may be thought of as special system supplied applications, except the journalizing routines which must be part of DBMS

CLASSIFICATION OF USERS
DBA
APPLICATION PROGRAMMERS
DBMS System Programmer
DBMS Language Programmer
Conventional Application Programmer
USERS
General User
Parametric User
Casual User

CASUAL USER
Interacts with the system irregularly and occasionally. They are untrained with respect to the system they use.
PARAMETRIC USER
Interacts with the system by invoking a predefined procedure. The parametric user often interacts with the system from an online terminal. Data entry may be possible for the parametric user.
GENERAL USER
Interfaces directly with a generalized module of the DBMS for retrieval or update. The user should know some query language statements.
CONVENTIONAL APPLICATION PROGRAMMER
Writes programs in a conventional programming language such as COBOL, C, etc. It is characterized by a low level one-record-at-a-time approach to access and manipulate data files. They can manipulate data, beyond what is provided by the DBMS.
DBMS LANGUAGE PROGRAMMER
Uses the self contained language of the DBMS. Command programs for specific application will be written by the DBMS language programmer.
DBMS SYSTEM PROGRAMMER
Develops generalized system program to interface with the DBMS. Application programmers focus on the application in the organization while system programmers write generalized procedures to operate on any part of the data base in conjunction with or in association with application programs.

KTU FOSS LAB - Dr Binu V P

Search This Blog

Data Base Schema and Architecture

Comments

Post a Comment

Popular posts from this blog

Writing a Bash Shell Script

Different syntax for writing arithmetic expressions in bash shell

Flow Control in shell script - if and case statements