Format Converters
We have already seen an example of how a converter can be set up. Let's take a step back and look at
the declaration of the function that adds a converter, HTConversion_add(...):
extern void HTConversion_add (HTList * conversions, CONST char * rep_in, CONST char * rep_out, HTConverter * converter, double quality, double secs, double secs_per_byte);
The first argument is a list object. List objects are one of several container objects in the Library, and they are explained in more detail in the W3C Library Internals. All we need to know at this point is how to create a list object:
extern HTList * HTList_new (void);
The next two arguments describe the input format and the output format of the data entering and leaving the converter, respectively. The syntax for these formats follows the syntax defined by the HTTP protocol and the MIME specification, which has a type string and a subtype string separated by a slash "/":
<type> "/" <subtype>
Some of the most common examples are:
text/plain text/html image/gif audio/basic */*
In addition to these "official" MIME types, the Library has a small set of internal representations that exist only within the Library. They describe data that is not really in a format of its own but in an intermediate state of the document. The two most used formats of this type are:
www/present www/unknown
The internal formats are characterized by the type www, which does not exist anywhere but in the Library. The first of the two subtypes shown represents the rendered document as presented to the user, and the second represents an unknown data format.
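The type/subtype syntax and the internal www type can be illustrated with a small stand-alone helper. This is only a sketch: split_format and is_internal_format are hypothetical names and not functions of the Library, and the comparison is a plain case-sensitive string compare.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not a Library function: split "type/subtype" at the
 * slash. Returns 1 on success, 0 if the string does not have that form or
 * a part does not fit into the caller's buffers. */
static int split_format(const char *format,
                        char *type, size_t type_size,
                        char *subtype, size_t subtype_size)
{
    const char *slash;
    size_t type_len, subtype_len;

    assert(format != NULL && type != NULL && subtype != NULL);

    slash = strchr(format, '/');
    if (slash == NULL)
        return 0;                       /* no separator at all */

    type_len = (size_t)(slash - format);
    subtype_len = strlen(slash + 1);
    if (type_len == 0 || subtype_len == 0)
        return 0;                       /* empty type or subtype */
    if (strchr(slash + 1, '/') != NULL)
        return 0;                       /* more than one slash */
    if (type_len >= type_size || subtype_len >= subtype_size)
        return 0;                       /* would overflow the buffers */

    memcpy(type, format, type_len);
    type[type_len] = '\0';
    memcpy(subtype, slash + 1, subtype_len + 1);    /* copies the NUL too */
    return 1;
}

/* Internal formats are recognised by the type "www". */
static int is_internal_format(const char *format)
{
    char type[32], subtype[64];

    return split_format(format, type, sizeof type, subtype, sizeof subtype)
        && strcmp(type, "www") == 0;
}
```

With this helper, "www/present" and "www/unknown" are reported as internal formats, while "image/gif" is not.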
The converter argument is a pointer to the function that is called to create a converter object
capable of handling the conversion from the input type to the output type. Because a pointer to the
function is registered rather than a converter object itself, converters can be set up dynamically.
This allows the Library to evaluate the set of registered converters each time a conversion is
requested and then choose the most suitable converter on the fly.
The next argument is the quality factor which we will describe in a separate paragraph later in this chapter. The last two arguments are not currently used but are reserved for future use. For now, using a value of 0 is perfectly valid.
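To make the idea of registering a function pointer and choosing a converter on the fly more concrete, here is a deliberately simplified model. It is not the Library's implementation: all Toy names are hypothetical, the formats are matched by exact string comparison only, and the highest quality factor simply wins.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy converter object; the Library's real converter objects look different. */
typedef struct {
    const char *name;
} ToyConverter;

/* Pointer to a function that creates a converter when it is needed. */
typedef ToyConverter (*ToyFactory)(void);

/* One registration: formats, factory function and quality factor. */
typedef struct {
    const char *rep_in;
    const char *rep_out;
    ToyFactory  create;
    double      quality;
} ToyEntry;

static ToyConverter make_inline_gif(void)
{
    ToyConverter c = { "inline-gif" };
    return c;
}

static ToyConverter make_fallback_gif(void)
{
    ToyConverter c = { "fallback-gif" };
    return c;
}

/* Evaluate all registrations for this request and return the matching one
 * with the highest quality, or NULL if nothing matches. On a tie the
 * earliest registration wins. */
static const ToyEntry *toy_select(const ToyEntry *entries, size_t count,
                                  const char *rep_in, const char *rep_out)
{
    const ToyEntry *best = NULL;
    size_t i;

    assert(rep_in != NULL && rep_out != NULL);

    for (i = 0; i < count; i++) {
        const ToyEntry *e = &entries[i];
        if (strcmp(e->rep_in, rep_in) != 0 || strcmp(e->rep_out, rep_out) != 0)
            continue;               /* formats do not match */
        if (best == NULL || e->quality > best->quality)
            best = e;               /* better candidate so far */
    }
    return best;
}
```

The converter itself is only created once an entry has been selected, by calling the create member of that entry. Nothing is instantiated at registration time, which is what makes the dynamic setup possible.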
Converters are intended to be used when we have our own module to handle the data coming from the remote server. The module can either be one provided by the Library or one made by the application. However, in some cases we would rather hand the data off to an external application for presentation. External applications are often viewers of some sort, for example a PostScript viewer or an MPEG viewer. The Library lets us register external applications as presenters in very much the same way as converters. This becomes obvious if we take a look at how we register presenters:
extern void HTPresentation_add (HTList * conversions, CONST char * representation, CONST char * command, CONST char * test_command, double quality, double secs, double secs_per_byte);
As with converters, the first argument is a list, which we create in exactly the same way as shown before. Presenters only need an input format, as we hand the data off to the external application and never see it again. Because presenters and converters are so similar, they are also treated very much alike internally in the Library. A list object can therefore contain both converters and presenters at the same time, which often makes management easier for the application than dealing with two separate lists.
The test_command argument is reserved for use with mailcap parsers as the test field of a mailcap file. The Library does not yet directly support mailcap files, but the registration of presenters is designed to be able to work with them. The Arena browser is an example of an application that has its own mailcap file parser while using the Library. The description of the test field in RFC 1524 is included below:
The "test" field may be used to test some external condition (e.g., the machine architecture, or the window system in use) to determine whether or not the mailcap line applies. It specifies a program to be run to test some condition. The semantics of execution and of the value returned by the test program are operating system dependent, with UNIX semantics specified in Appendix A. If the test fails, a subsequent mailcap entry should be sought. Multiple test fields are not permitted -- since a test can call a program, it can already be arbitrarily complex.
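As a rough illustration of how an application with its own mailcap parser might evaluate such a test field, consider the sketch below. It rests on assumptions that are not stated in this guide: the test command is run through the C library's system() function, a zero exit status means that the test passed, and an entry without a test command always applies. The name entry_applies is hypothetical.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helper: decide whether a mailcap-style entry applies.
 * Assumes UNIX-like semantics where the test command is handed to
 * system() and a zero exit status means that the test passed. */
static int entry_applies(const char *test_command)
{
    if (test_command == NULL)
        return 1;                       /* no test: entry always applies */

    /* An empty command is treated as a registration error in this sketch. */
    assert(test_command[0] != '\0');

    return system(test_command) == 0;
}
```

If entry_applies returns 0, the application would go on to look for a subsequent entry, as the RFC text above requires.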
The last three arguments are identical to those of the conversion registration, so there is no need to describe them again here. Again, the quality factor will be described in detail later in this chapter.
extern void HTLanguage_add (HTList * list, CONST char * lang, double quality);
The list object containing the set of natural languages is similar to the list objects containing the converters and the presenters. However, in contrast to those two, which can share one list, the natural languages must be kept in a list of their own.
The semantics of the language argument closely follow the language tag of the HTTP protocol, which in turn is based on RFC 1766. Some example tags are:
en en-US en-cockney i-cherokee x-pig-latin
where any two-letter primary tag is an ISO 639 language abbreviation and any two-letter initial subtag is an ISO 3166 country code.
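The two-letter rules for language tags can be checked mechanically. The following sketch only looks at the shape of a tag; it does not verify that the letters are actually assigned in ISO 639 or ISO 3166. All function names are hypothetical and not part of the Library.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Length of the tag component starting at s (up to '-' or end of string). */
static size_t component_length(const char *s)
{
    return strcspn(s, "-");
}

/* 1 if the n characters at s are all alphabetic in the current C locale. */
static int all_letters(const char *s, size_t n)
{
    size_t i;

    for (i = 0; i < n; i++)
        if (!isalpha((unsigned char)s[i]))
            return 0;
    return 1;
}

/* Two-letter primary tag: to be read as an ISO 639 language abbreviation. */
static int has_iso639_primary(const char *tag)
{
    assert(tag != NULL);
    return component_length(tag) == 2 && all_letters(tag, 2);
}

/* Two-letter first subtag: to be read as an ISO 3166 country code. */
static int has_iso3166_subtag(const char *tag)
{
    size_t primary;
    const char *sub;

    assert(tag != NULL);
    primary = component_length(tag);
    if (tag[primary] != '-')
        return 0;                       /* no subtag present */
    sub = tag + primary + 1;
    return component_length(sub) == 2 && all_letters(sub, 2);
}
```

Applied to the example tags, en-US has both a two-letter primary tag and a two-letter subtag, en-cockney has only the two-letter primary tag, and i-cherokee and x-pig-latin have neither.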
extern void HTEncoding_add (HTList * list, CONST char * encoding, double quality);
The list argument is the now well-known way of handling these preferences, and we will see it many more times throughout the guide. The "encoding" argument is a constant string, just like the data format descriptions in the registration of converters and presenters. The values are also strongly inspired by the HTTP protocol and the MIME specification, and some of the most common examples are:
base64 compress gzip
As with the list of natural languages, the list of encoders and decoders must be a separate list.
extern void HTCharset_add (HTList * list, CONST char * charset, double quality);
The charset argument is also inspired by the HTTP protocol and the MIME specification. Some of the most common examples of the charset parameter are:
US-ASCII ISO-8859-1 UNICODE-1-1
Again, the list of preferred character sets must be a separate list.
It is a bit different for converters, where the quality factor often reflects the application's ability to handle the data format rather than the user's perception. As an example, it is often faster to use a converter than a presenter, as it takes time to launch the external application and the Library cannot use the progressive display mechanisms that are often available with converters. Therefore, if we are capable of handling an image in GIF format inline but rely on an external viewer for presenting PostScript, we might set up the following list:
HTConversion_add (converters, "image/gif", "www/present", GifPresenter, 1.0, 0.0, 0.0);
HTPresentation_add (presenters, "application/postscript", "ghostview %s", NULL, 0.5, 0.0, 0.0);
where the GIF converter is registered with a quality factor of 1.0 and the PostScript presenter with a quality factor of 0.5.
Here we will only show how to enable the preferences globally. Later, when we have discussed how to create a request object, we will see how to enable the preferences locally and whether they are added to the global list or completely override it for a particular request.
extern void HTFormat_setConversion (HTList *list);
extern HTList * HTFormat_conversion (void);
extern void HTFormat_setEncoding (HTList *list);
extern HTList * HTFormat_encoding (void);
extern void HTFormat_setLanguage (HTList *list);
extern HTList * HTFormat_language (void);
extern void HTFormat_setCharset (HTList *list);
extern HTList * HTFormat_charset (void);
Common to the cleanup methods is that once they have been called you can no longer use the lists, as they no longer point to valid memory. The first mechanism for cleaning up lists is to call the cleanup method of each preference, as indicated below:
extern void HTConversion_deleteAll (HTList * list);
extern void HTPresentation_deleteAll (HTList * list);
extern void HTEncoding_deleteAll (HTList * list);
extern void HTLanguage_deleteAll (HTList * list);
extern void HTCharset_deleteAll (HTList * list);
The second mechanism, which cleans up all globally registered preferences at once, can often be used to simplify the management done by the application. Note, however, that all global lists become inaccessible for future reference. If you want to define new sets of preferences, you need to start all over again and create a new list object.
extern void HTFormat_deleteAll (void);
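The ownership rule behind the global cleanup can be modelled in a few lines. The sketch below is a toy model with hypothetical Toy names, not the Library's code. It assumes that registering a list globally hands the list over to the global state, and that the global cleanup frees the list, so that any pointer the application still holds must be discarded.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Toy list; stands in for the Library's list object. */
typedef struct {
    int entries;
} ToyList;

/* The single globally registered list of this toy model. */
static ToyList *toy_global = NULL;

static ToyList *toy_list_new(void)
{
    ToyList *list = calloc(1, sizeof *list);

    assert(list != NULL);           /* keep the sketch short: no recovery */
    return list;
}

/* Register a list globally; the model takes over ownership of it. */
static void toy_set_global(ToyList *list)
{
    toy_global = list;
}

static ToyList *toy_get_global(void)
{
    return toy_global;
}

/* Free the global list. Every pointer to it is dangling afterwards. */
static void toy_delete_all(void)
{
    free(toy_global);
    toy_global = NULL;
}

/* Start over: drop the old global list and register a fresh one. */
static ToyList *toy_restart(void)
{
    ToyList *fresh;

    toy_delete_all();               /* the old global list is gone now */
    fresh = toy_list_new();
    toy_set_global(fresh);
    return fresh;
}
```

An application following this pattern would reset its own copy of the list pointer to NULL right after the cleanup call, so that a stale pointer cannot be used by accident.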
extern void HTConverterInit (HTList * conversions);
There is a similar function for registering a common set of presenters that can be found on many (especially Unix) platforms:
extern void HTPresenterInit (HTList * conversions);
To show the similarity between how converters and presenters are handled in the Library, there is also a single function that does the work of the two previous functions at once:
extern void HTFormatInit (HTList * conversions);