conversion:charset
timrdf edited this page Jul 6, 2012
csv2rdf4lod-automation is licensed under the [Apache License, Version 2.0](/~https://github.com/timrdf/csv2rdf4lod-automation/wiki/License)
Structural conversion:Enhancements:
- conversion:charset - to specify the character encoding of the input file.
- conversion:HeaderRow - to specify the row that contains header data (or [dimensional values](Converting with cell based subjects)).
- conversion:DataStartRow - to specify the first (inclusive) row that contains data.
- conversion:delimits_cell - to specify the character that terminates a cell.
- conversion:Only_if_column - to omit processing a row if a certain column's value is missing.
- conversion:Repeat_previous_if_empty_column - to "downfill" an empty cell with the value from above.
- conversion:repeat_previous - to specify a value that indicates repetition (instead of just an empty value).
- conversion:Omitted - to specify a column to omit.
- conversion:DataEndRow - to specify the last (inclusive) row that contains data.
csv2rdf4lod assumes the input character encoding is UTF-8. If this is not the case, then conversion:charset can be used to specify the appropriate character encoding. The values that you are likely to use are listed here:
- US-ASCII - Seven-bit ASCII, a.k.a. ISO646-US, a.k.a. the Basic Latin block of the Unicode character set
- ISO-8859-1 - ISO Latin Alphabet No. 1, a.k.a. ISO-LATIN-1
- UTF-8 - Eight-bit UCS Transformation Format
- UTF-16BE - Sixteen-bit UCS Transformation Format, big-endian byte order
- UTF-16LE - Sixteen-bit UCS Transformation Format, little-endian byte order
- UTF-16 - Sixteen-bit UCS Transformation Format, byte order identified by an optional byte-order mark
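If it is unclear which of these values applies to a given input file, one rough check is to try decoding the file's bytes with a few candidates in order. The following is a minimal sketch in Python and is not part of csv2rdf4lod; the function name `first_decodable_charset` and the candidate order are assumptions made for illustration. The UTF-16 variants are left out because almost any even-length byte sequence decodes as UTF-16 without error, and ISO-8859-1 is tried last because it accepts every byte sequence, so a match there is only a fallback guess.

```python
# Hypothetical helper (not part of csv2rdf4lod): guess a conversion:charset value
# by trying strict decoders in order, from most to least restrictive.
CANDIDATES = ["US-ASCII", "UTF-8", "ISO-8859-1"]


def first_decodable_charset(data: bytes, candidates=CANDIDATES):
    """Return the first candidate name that decodes `data` without error, else None."""
    for name in candidates:
        try:
            data.decode(name)  # strict decoding; raises on invalid byte sequences
        except UnicodeDecodeError:
            continue
        return name
    return None


if __name__ == "__main__":
    # Sample bytes standing in for the contents of an input CSV.
    print(first_decodable_charset(b"Title,Year\n"))
    print(first_decodable_charset("caf\u00e9".encode("utf-8")))
    print(first_decodable_charset("caf\u00e9".encode("iso-8859-1")))
```

For a real file, the bytes could be read with `open(path, "rb").read()` and passed to the helper. The result is only a hint: a successful decode shows that the bytes are valid in that encoding, not that the text is what the publisher intended.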
<http://logd.tw.rpi.edu/source/lebot/dataset/replacement-characters-from-csv-api/version/2012-Jul-05/conversion/enhancement/1>
   a conversion:LayerDataset, void:Dataset;
   conversion:base_uri "http://logd.tw.rpi.edu"^^xsd:anyURI;
   conversion:source_identifier "lebot";
   conversion:dataset_identifier "replacement-characters-from-csv-api";
   conversion:version_identifier "2012-Jul-05";
   conversion:enhancement_identifier "1";
   conversion:conversion_process [
      a conversion:EnhancementConversionProcess;
      conversion:enhancement_identifier "1";
      dcterms:creator <http://purl.org/twc/id/machine/lebot/MacBookPro6_2#lebot>;
      dcterms:created "2012-07-06T09:31:27-04:00"^^xsd:dateTime;
      conversion:charset "ISO-8859-1"; # <- Add this to specify character encoding of the input CSV (default is UTF-8)
      conversion:delimits_cell ",";
      conversion:enhance [
         ov:csvCol 1;
         ov:csvHeader "Title";