Serialization of data structures in CSV/JSON/XML
Introduction
The Util.Serialize
package provides a customizable
framework to serialize and de-serialize data structures in CSV, JSON and XML.
It is inspired from the Java XStream library.
Record Mapping
The serialization relies on a mapping that must be provided for each data structure that must be read. Basically, it consists in writing an enum type, a procedure and instantiating a mapping package. Let's assume we have a record declared as follows:
type Address is record
City : Unbounded_String;
Street : Unbounded_String;
Country : Unbounded_String;
Zip : Natural;
end record;
The enum type shall define one value for each record member that has to be serialized/deserialized.
type Address_Fields is (FIELD_CITY, FIELD_STREET, FIELD_COUNTRY, FIELD_ZIP);
The de-serialization uses a specific procedure to fill the record member. The procedure that must be written is in charge of writing one field in the record. For that it gets the record as an in out parameter, the field identification and the value.
procedure Set_Member (Addr : in out Address;
Field : in Address_Fields;
Value : in Util.Beans.Objects.Object) is
begin
case Field is
when FIELD_CITY =>
Addr.City := To_Unbounded_String (Value);
when FIELD_STREET =>
Addr.Street := To_Unbounded_String (Value);
when FIELD_COUNTRY =>
Addr.Country := To_Unbounded_String (Value);
when FIELD_ZIP =>
Addr.Zip := To_Integer (Value);
end case;
end Set_Member;
The procedure will be called by the CSV, JSON or XML reader when a field is recognized.
The serialization to JSON or XML needs a function that returns the field value from the record value and the field identification. The value is returned as a Util.Beans.Objects.Object type which can hold a string, a wide wide string, a boolean, a date, an integer or a float.
function Get_Member (Addr : in Address;
Field : in Address_Fields) return Util.Beans.Objects.Object is
begin
case Field is
when FIELD_CITY =>
return Util.Beans.Objects.To_Object (Addr.City);
when FIELD_STREET =>
return Util.Beans.Objects.To_Object (Addr.Street);
when FIELD_COUNTRY =>
return Util.Beans.Objects.To_Object (Addr.Country);
when FIELD_ZIP =>
return Util.Beans.Objects.To_Object (Addr.Zip);
end case;
end Get_Member;
A mapping package has to be instantiated to provide the necessary glue to tie the set procedure to the framework.
package Address_Mapper is
new Util.Serialize.Mappers.Record_Mapper
(Element_Type => Address,
Element_Type_Access => Address_Access,
Fields => Address_Fields,
Set_Member => Set_Member);
Note: a bug in the gcc compiler does not allow to specify the !Get_Member function in the generic package. As a work-arround, the function must be associated with the mapping using the Bind procedure.
Mapping Definition
The mapping package defines a Mapper
type which holds the
mapping definition. The mapping definition tells a mapper what name correspond to the different fields.
It is possible to define several mappings
for the same record type. The mapper object is declared as follows:
Address_Mapping : Address_Mapper.Mapper;
Then, each field is bound to a name as follows:
Address_Mapping.Add_Mapping ("city", FIELD_CITY);
Address_Mapping.Add_Mapping ("street", FIELD_STREET);
Address_Mapping.Add_Mapping ("country", FIELD_COUNTRY);
Address_Mapping.Add_Mapping ("zip", FIELD_ZIP);
Once initialized, the same mapper can be used read several files in several threads at the same time (the mapper is only read by the JSON/XML parsers).
De-serialization
To de-serialize a JSON object, a parser object is created and one or several mappings are defined:
Reader : Util.Serialize.IO.JSON.Parser;
...
Reader.Add_Mapping ("address", Address_Mapping'Access);
For an XML de-serialize, we just have to use another parser:
Reader : Util.Serialize.IO.XML.Parser;
...
Reader.Add_Mapping ("address", Address_Mapping'Access);
For a CSV de-serialize, we just have to use another parser:
Reader : Util.Serialize.IO.CSV.Parser;
...
Reader.Add_Mapping ("", Address_Mapping'Access);
The next step is to indicate the object that the de-serialization will write into. For this, the generic package provided the !Set_Context
procedure
to register the root object that will be
filled according to the mapping.
Addr : aliased Address;
...
Address_Mapper.Set_Context (Reader, Addr'Access);
The Parse
procedure parses a file using a CSV, JSON or XML parser. It uses the mappings registered by Add_Mapping
and fills the objects registered by Set_Context
. When the parsing is successful, the Addr
object will hold the values.
Reader.Parse (File);
Parser Specificities
XML
XML has attributes and entities both of them being associated with a name.
For the mapping, to specify that a value is stored in an XML attribute, the
name must be prefixed by the @ sign (this is very close to an XPath expression). For example if the city
XML entity has an id
attribute,
we could map it to a field FIELD_CITY_ID
as follows:
Address_Mapping.Add_Mapping ("city/@id", FIELD_CITY_ID);
CSV
A CSV file is flat and each row is assumed to contain the same kind of entities. By default the CSV file contains as first row a column header which is used
by the de-serialization to make the column field association. The mapping
defined through Add_Mapping
uses the column header name to indicate
which column correspond to which field.
If a CSV file does not contain a column header, the mapping must be created by using the default column header names (Ex: A, B, C, ..., AA, AB, ...). The parser must be told about this lack of column header:
Parser.Set_Default_Headers;