If the amount of information to be stored is small and structured, a simple flat text file representation will suffice. For large amounts of numerical data, binary files are required and the format is usually application specific.
Occasionally, one wants to save and restore a representation of the set of objects in an OOP program. This is a little more difficult, since the objects can contain references to each other. These references are memory addresses and will not be stable from one run of the program to another. The solution to this problem is straightforward (if tedious). One essentially builds a dictionary or array of all the objects to be dumped and converts all references to indexes or keys in this dictionary, then dumps both the dictionary and the object contents (which now contain only basic types). To reverse the process, one reads the data file, building the dictionary and creating the objects, then converts the indexes (or keys) back into references. The serialization and object stream features of Java do this automatically for you. In languages such as C, it must be done by hand.
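To make the by-hand procedure concrete, here is a minimal sketch in Java. The Person class, the GraphDump helper, and the "name,index" line format are all made up for illustration. The sketch assumes every referenced object is itself in the list being dumped and that names contain no commas.

```java
import java.util.ArrayList;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical example class: an object holding a reference to another object.
class Person {
    String name;
    Person friend; // a reference; not stable from one run to the next

    Person(String name) { this.name = name; }
}

// Hypothetical helper showing the dump/restore idea by hand.
class GraphDump {
    // Dump: give every object an index, then write each object as
    // "name,friendIndex" (-1 means no friend). Only basic types remain.
    static List<String> dump(List<Person> people) {
        Map<Person, Integer> index = new IdentityHashMap<>();
        for (int i = 0; i < people.size(); i++) {
            index.put(people.get(i), i);
        }
        List<String> lines = new ArrayList<>();
        for (Person p : people) {
            int friendIndex = (p.friend == null) ? -1 : index.get(p.friend);
            lines.add(p.name + "," + friendIndex);
        }
        return lines;
    }

    // Restore: first create all the objects, then convert the indexes
    // back into references.
    static List<Person> restore(List<String> lines) {
        List<Person> people = new ArrayList<>();
        int[] friendIndexes = new int[lines.size()];
        for (int i = 0; i < lines.size(); i++) {
            String[] parts = lines.get(i).split(",");
            people.add(new Person(parts[0]));
            friendIndexes[i] = Integer.parseInt(parts[1]);
        }
        for (int i = 0; i < people.size(); i++) {
            if (friendIndexes[i] >= 0) {
                people.get(i).friend = people.get(friendIndexes[i]);
            }
        }
        return people;
    }
}
```

Restoring has to be done in two passes because an object may refer to one that appears later in the file, which is why the indexes are held in a temporary array until all objects exist.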
There is a full course on databases later in the year, and the Web course will touch on them also. Here we will introduce them and focus on the relationship between databases and object-oriented programming.
Most database programs provide at least 4 functions.
What is missing from databases that we have in OOP? Primarily inheritance (which can be implemented by hand), methods, and polymorphism. Some attempts have been made to bring the database model even closer to the object model (i.e., object-oriented databases), but these have not yet replaced the relational model.
You will learn SQL in detail in your database course. Here is an example to give you the basic idea.
    SELECT FirstName, LastName, Salary from EMPLOYEE where (Salary > 10000) and (FirstName = 'Sam');

When executed, this will return a subset of the EMPLOYEE table having only the columns FirstName, LastName, and Salary, and whose rows match the conditions in the where clause. If typed at a database console app, this will print back something like:

    FirstName  LastName  Salary
    ----------------------------------
    Sam        Nunn      12000
    Sam        Digita    20000

There are many powerful features of SQL, but this is the basic idea.

Synchronization and Robustness
This is what you really pay the big bucks for in a database program. The previous two features are fairly straightforward to implement (although query optimization is a challenge). These last two are much more difficult.

We saw a little of the problem when we discussed thread synchronization. Say we have two operations (these would actually be SQL statements) on an account table, Deposit(account, amount) and Withdraw(account, amount), in an environment where many users are reading and writing the database. Say we also want to do transfers, which withdraw from one account and then deposit into another. As in the case of threads, we need synchronization operations to ensure the update operations are thread safe. But we actually need much stronger guarantees. We need transaction support.
Transactions
A transaction is a sequence of operations considered as a unit for which a certain set of conditions is guaranteed. These conditions are often referred to as the ACID conditions (or the ACID tests), based on their names.
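The following toy sketch illustrates only the "all or nothing" aspect (atomicity, the A in ACID) for the transfer operation described earlier. The Bank class and its in-memory map are made up for illustration and stand in for an account table. A real database also has to survive crashes and serve many users at once, which this sketch does not attempt.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical in-memory stand-in for an account table.
// All methods are synchronized, so other threads using this object
// never observe a half-finished transfer.
class Bank {
    private final Map<String, Integer> balances = new HashMap<>();

    synchronized void open(String account, int amount) {
        balances.put(account, amount);
    }

    synchronized int balance(String account) {
        return balances.get(account);
    }

    synchronized void deposit(String account, int amount) {
        if (!balances.containsKey(account)) {
            throw new IllegalArgumentException("no such account: " + account);
        }
        balances.put(account, balances.get(account) + amount);
    }

    synchronized void withdraw(String account, int amount) {
        if (!balances.containsKey(account)) {
            throw new IllegalArgumentException("no such account: " + account);
        }
        if (balances.get(account) < amount) {
            throw new IllegalStateException("insufficient funds in " + account);
        }
        balances.put(account, balances.get(account) - amount);
    }

    // The transfer is treated as a unit: either both steps happen or neither.
    synchronized void transfer(String from, String to, int amount) {
        Map<String, Integer> snapshot = new HashMap<>(balances); // kept for rollback
        try {
            withdraw(from, amount);
            deposit(to, amount);
        } catch (RuntimeException e) {
            // Roll back to the state before the transfer started.
            balances.clear();
            balances.putAll(snapshot);
            throw e;
        }
    }
}
```

If the deposit step fails (for example, because the target account does not exist), the withdrawal is undone and the caller sees the exception, so money is never lost halfway through a transfer.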
The Java mechanism for accessing remote databases is called JDBC (more or less Java's answer to ODBC). We won't go into details here (Volume 2 of Core Java gives an introduction, and there are several books that cover the subject completely). Any database programming interface (like JDBC) must support the following operations:
XML attempts to correct some of the weaknesses in HTML in that its goal is to more clearly separate the concepts of data description and data formatting and presentation. A text document in XML would be entirely described in terms of document components (chapters, sections, headings, tables, footnotes, etc.). An entirely different document, usually called a style sheet, would describe (often, but not always, also in XML) how to format these structures for a particular display technology. Structured data, for example the result of a database query, would also be described in terms of its component data, separated from any display and presentation information.
The syntax of XML languages is particularly simple. Like HTML, XML documents consist of text plus markup elements describing the structure of the text. The elements are contained in angle brackets. (This means that angle brackets in XML text must be escaped so they do not appear as markup, which is a hassle when writing XML lectures in HTML.) Internally, the elements contain a tag name and zero or more attribute-value pairs. Example XML tags:
    <picture src="url1" name="myname">
    </picture>

Unlike HTML, the attribute values must be given and enclosed in "". Also unlike HTML, each tag must have a corresponding close tag. Also, all element names and attributes are case-sensitive. The open and close tags must match in stack order.
Inside any element (in between the open and close tags) one can have text data with additional markup. The intent is that this data is somehow described by the enclosing tags. There is a special syntax for tags with no enclosed text or data.
    <picture src="url1" name="myname"/>

This is about it. Clearly this system can be used for describing the components of text documents (books, articles, etc.). It can also be used to describe structured data, as in our DB example above:
    <EmployeeList>
      <Employee>
        <FirstName>Sam</FirstName>
        <LastName>Nunn</LastName>
        <Salary>12000</Salary>
      </Employee>
      <Employee>
        <FirstName>Sam</FirstName>
        <LastName>Digita</LastName>
        <Salary>20000</Salary>
      </Employee>
    </EmployeeList>
The syntax of DTDs and Schema is tedious, although not difficult, so we will not present examples here. There are many books on XML that cover this in great detail.
There are two standard interfaces for parsing and manipulating XML in programs: SAX and DOM. SAX is an event-based model that processes the document serially. It allows the programmer to override processing methods that get called each time a begin or end tag is encountered. The DOM parser produces a tree structure in memory representing the document and provides an interface for accessing this tree structure (which looks like standard tree methods, with additional methods for accessing attributes). DOM, by the way, stands for Document Object Model. There are SAX and DOM libraries for C++ and Java available from Apache. Java also has a simplified version of DOM called JDOM.
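As a rough illustration of the difference, the sketch below extracts the last names from the employee list twice, once with DOM and once with SAX. It uses the JAXP parsers that ship with the JDK (javax.xml.parsers) rather than a separately downloaded library. The class and method names (XmlParsingSketch, lastNamesWithDom, lastNamesWithSax) are made up for this example.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.SAXParserFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

class XmlParsingSketch {
    static final String XML =
        "<EmployeeList>"
        + "<Employee><FirstName>Sam</FirstName><LastName>Nunn</LastName>"
        + "<Salary>12000</Salary></Employee>"
        + "<Employee><FirstName>Sam</FirstName><LastName>Digita</LastName>"
        + "<Salary>20000</Salary></Employee>"
        + "</EmployeeList>";

    private static ByteArrayInputStream input() {
        return new ByteArrayInputStream(XML.getBytes(StandardCharsets.UTF_8));
    }

    // DOM: build the whole tree in memory, then query it.
    static List<String> lastNamesWithDom() throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder().parse(input());
        NodeList nodes = doc.getElementsByTagName("LastName");
        List<String> names = new ArrayList<>();
        for (int i = 0; i < nodes.getLength(); i++) {
            names.add(nodes.item(i).getTextContent());
        }
        return names;
    }

    // SAX: override callbacks that fire as the document is read serially.
    static List<String> lastNamesWithSax() throws Exception {
        List<String> names = new ArrayList<>();
        DefaultHandler handler = new DefaultHandler() {
            private boolean inLastName = false;
            private final StringBuilder text = new StringBuilder();

            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes attrs) {
                if (qName.equals("LastName")) {
                    inLastName = true;
                    text.setLength(0);
                }
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                if (inLastName) {
                    text.append(ch, start, length);
                }
            }

            @Override
            public void endElement(String uri, String localName, String qName) {
                if (qName.equals("LastName")) {
                    names.add(text.toString());
                    inLastName = false;
                }
            }
        };
        SAXParserFactory.newInstance().newSAXParser().parse(input(), handler);
        return names;
    }
}
```

The DOM version is shorter but holds the entire document in memory. The SAX version keeps only the state it needs, at the cost of tracking where it is in the document by hand.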
There are several technologies for specifying and performing these transformations. The older, and still current, technologies are the style sheet systems that grew up around HTML, particularly CSS (Cascading Style Sheets). These associate with each element type (possibly in context) a display format (font, color, etc.). Another system, called DSSSL, is used by the popular DocBook text document description language (XML compatible).
Document transformations can be accomplished entirely within the XML milieu using XSL and XSLT. XSL is an XML-compatible style sheet and tree transformation language. An XSLT processor is a translator that takes an XML document and an XSL style sheet and produces a new XML document. Associated with XSL is a display formatting language called XSL-FO, which can be the final output of such a transformation, though few systems now support display of XSL-FO natively. The beauty of the all-XML scheme is that both the XSL style sheet and the original XML document can undergo multiple levels of transformation before final re-transmission or display.
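A small sketch of such a transformation, using the XSLT support built into the JDK (javax.xml.transform), might look like the following. The style sheet, the output element names (Names, Name), and the XsltSketch class are invented for this example. The style sheet rewrites the employee list into a list of full names.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

class XsltSketch {
    static final String XML =
        "<EmployeeList>"
        + "<Employee><FirstName>Sam</FirstName><LastName>Nunn</LastName>"
        + "<Salary>12000</Salary></Employee>"
        + "<Employee><FirstName>Sam</FirstName><LastName>Digita</LastName>"
        + "<Salary>20000</Salary></Employee>"
        + "</EmployeeList>";

    // Hypothetical style sheet: one template per structure to be rewritten.
    static final String STYLE_SHEET =
        "<xsl:stylesheet version=\"1.0\""
        + " xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\">"
        + "<xsl:template match=\"/EmployeeList\">"
        + "<Names><xsl:apply-templates select=\"Employee\"/></Names>"
        + "</xsl:template>"
        + "<xsl:template match=\"Employee\">"
        + "<Name><xsl:value-of select=\"FirstName\"/>"
        + "<xsl:text> </xsl:text>"
        + "<xsl:value-of select=\"LastName\"/></Name>"
        + "</xsl:template>"
        + "</xsl:stylesheet>";

    // Apply the style sheet to the document and return the new XML as text.
    static String transform() throws Exception {
        Transformer transformer = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(STYLE_SHEET)));
        transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        StringWriter out = new StringWriter();
        transformer.transform(new StreamSource(new StringReader(XML)),
                              new StreamResult(out));
        return out.toString();
    }
}
```

Because the result is itself XML, it could be fed into another style sheet in the same way, which is the multi-level transformation described above.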
There is much more to be read about XML, much of it standards and details. I refer you to the many books on XML, XML and Java, and to the Apache Web site where a lot of interesting open source work is going on in connection with Java and XML.