
Data exchange

From Wikipedia, the free encyclopedia

Data exchange is the process of taking data structured under a source schema and transforming it into data structured under a target schema, so that the target data is an accurate representation of the source data.[1] Data exchange allows data to be shared between different computer programs.

It is similar to the related concept of data integration, except that in data exchange the data is actually restructured (with possible loss of content). Given all of the constraints, there may be no way to transform an instance. Conversely, there may be numerous ways to transform the instance (possibly infinitely many), in which case a "best" choice of solutions has to be identified and justified.
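The following is a minimal sketch of such a transformation, not a definitive method. The source schema (first_name, last_name, phone), the target schema (full_name, manager), and the `LabeledNull` placeholder are all hypothetical and chosen only for illustration. The mapping drops `phone` (loss of content), and because the source says nothing about managers, many target instances would be valid; one common convention is to fill such positions with distinct placeholders rather than guess a value.

```python
from itertools import count

_null_ids = count(1)


class LabeledNull:
    """Placeholder for a target value that the source data does not determine."""

    def __init__(self):
        self.label = f"N{next(_null_ids)}"

    def __repr__(self):
        return f"<null {self.label}>"


def exchange(source_rows):
    """Map rows of the (hypothetical) source schema to the target schema."""
    target_rows = []
    for row in source_rows:
        target_rows.append({
            # Restructuring: two source fields become one target field.
            "full_name": f"{row['first_name']} {row['last_name']}",
            # Unknown in the source: any manager would satisfy the mapping,
            # so a fresh placeholder is used instead of an invented value.
            "manager": LabeledNull(),
        })
        # The source field "phone" has no counterpart in the target and is lost.
    return target_rows


source = [{"first_name": "Ada", "last_name": "Lovelace", "phone": "555-0100"}]
target = exchange(source)
```

Each call to `exchange` produces one particular target instance among the many that are consistent with the source.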

Single-domain data exchange

In some domains, a few dozen different source and target schemas (proprietary data formats) may exist. An "exchange" or "interchange format" is often developed for a single domain, and the necessary routines (mappings) are then written to translate each source schema to each target schema indirectly, using the interchange format as an intermediate step.[2] That requires far less work than writing and debugging the hundreds of routines that would be needed to translate every source schema directly to every target schema.
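The saving can be sketched as follows. This is an illustration under stated assumptions: the formats "csv", "kv", and "json", the in-memory interchange form (a list of dictionaries), and all function names are hypothetical. Each format needs only a reader into the interchange form or a writer out of it, so adding a format means adding one routine rather than one per counterpart.

```python
import csv
import io
import json


def csv_to_hub(text):
    """Read CSV text into the interchange form (a list of dicts)."""
    return [dict(row) for row in csv.DictReader(io.StringIO(text))]


def kv_to_hub(text):
    """Read 'key=value;key=value' lines into the interchange form."""
    return [
        dict(pair.split("=", 1) for pair in line.split(";"))
        for line in text.strip().splitlines()
    ]


def hub_to_json(records):
    """Write the interchange form as JSON text."""
    return json.dumps(records, sort_keys=True)


def hub_to_kv(records):
    """Write the interchange form as 'key=value;key=value' lines."""
    return "\n".join(
        ";".join(f"{key}={value}" for key, value in record.items())
        for record in records
    )


READERS = {"csv": csv_to_hub, "kv": kv_to_hub}
WRITERS = {"json": hub_to_json, "kv": hub_to_kv}


def convert(text, source, target):
    """Translate indirectly: source format -> interchange form -> target format."""
    return WRITERS[target](READERS[source](text))


def routine_counts(n_sources, n_targets):
    """Return (direct, via_interchange) numbers of routines to write."""
    return n_sources * n_targets, n_sources + n_targets
```

With 24 source and 24 target formats, `routine_counts(24, 24)` gives 576 direct routines against 48 routines that go through the interchange format.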

Examples of these transformative interchange formats include:

Data exchange methods

There are two types of data exchange: broadcast data exchange and peer-to-peer (unicast) data exchange.[9]

In a broadcast network, data is transmitted simultaneously to all participants. As in a conference call, all participants get exactly the same information from the speaker at the same time.[10]

In a peer-to-peer (unicast) data exchange model, data is sent only to the targeted receiver, identified by a specific address. As with a telephone call or an email, information flows only between two network participants.[11]
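The difference between the two models can be sketched with a small in-memory simulation. The `Network` class and its methods are hypothetical names used only for illustration; no real networking is involved, and the choice not to deliver a broadcast back to its sender is an assumption of this sketch.

```python
class Network:
    """Toy network: each participant has an address and an inbox."""

    def __init__(self):
        self.inboxes = {}

    def join(self, address):
        """Register a participant under the given address."""
        self.inboxes[address] = []

    def broadcast(self, sender, message):
        """Deliver the same message to every participant except the sender."""
        for address, inbox in self.inboxes.items():
            if address != sender:
                inbox.append((sender, message))

    def unicast(self, sender, receiver, message):
        """Deliver the message only to the participant at the receiver address."""
        self.inboxes[receiver].append((sender, message))


net = Network()
for name in ("alice", "bob", "carol"):
    net.join(name)

net.broadcast("alice", "meeting at 10")   # bob and carol both receive it
net.unicast("bob", "carol", "running late")  # only carol receives it
```

After these two calls, bob's inbox holds only the broadcast, while carol's holds both the broadcast and the unicast message.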

Data exchange languages

A data interchange (or exchange) language/format is a language that is domain-independent and can be used for data from any kind of discipline.[12] Such languages have "evolved from being markup and display-oriented to further support the encoding of metadata that describes the structural attributes of the information."[13]

Practice has shown that certain types of formal languages are better suited for this task than others, since their specification is driven by a formal process instead of the needs of a particular software implementation. For example, XML is a markup language that was designed to enable the creation of dialects (the definition of domain-specific sublanguages).[14] However, it does not contain domain-specific dictionaries or fact types. Reliable data exchange benefits from the availability of standard dictionaries and taxonomies and of tool libraries such as parsers, schema validators, and transformation tools.[citation needed]
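As a small illustration of domain independence, the same record can be written in two generic exchange languages using only parsers and serializers from the Python standard library. The record contents and the helper names `to_json`, `to_xml`, and `from_xml` are assumptions made for this sketch; the flat element-per-field XML layout is one possible encoding, not a standard one.

```python
import json
import xml.etree.ElementTree as ET

# A hypothetical record; nothing in either language is specific to its domain.
record = {"id": "42", "name": "pump", "unit": "bar"}


def to_json(rec):
    """Serialize the record as JSON text."""
    return json.dumps(rec, sort_keys=True)


def to_xml(rec, root_tag="record"):
    """Serialize the record as XML text, one child element per field."""
    root = ET.Element(root_tag)
    for key, value in rec.items():
        ET.SubElement(root, key).text = str(value)
    return ET.tostring(root, encoding="unicode")


def from_xml(text):
    """Parse XML text produced by to_xml back into a dict."""
    return {child.tag: child.text for child in ET.fromstring(text)}


json_text = to_json(record)
xml_text = to_xml(record)
```

Both texts can be parsed back into the original record, which is what allows a receiving program to reconstruct the data without knowing the sender's internal representation.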

Popular languages used for data exchange

The following is a partial list of popular generic languages used for data exchange in multiple domains.


| Name/Abbreviation | Schemas | Flexible | Semantic verification | Dictionary | Information Model | Synonyms and homonyms | Dialecting | Web standard | Transformations | Lightweight | Human readable | Compatibility |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| RDF | Yes[1] | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Partial | Subset of Semantic web |
| XML | Yes[2] | Yes | No | No | No | No | Yes | Yes | Yes | No | Yes | subset of SGML, HTML |
| Atom | Yes | Unknown | Unknown | Unknown | No | Unknown | Yes | Yes | Yes | No | No | XML dialect |
| JSON | No | Unknown | Unknown | Unknown | No | Unknown | No | Yes | No | Yes | Yes | subset of YAML |
| YAML | No[3] | Unknown | Unknown | Unknown | No | Unknown | No | No | No[3] | Yes | Yes[4] | superset of JSON |
| REBOL | Yes[7] | Yes | No | Yes | No | Yes | Yes | No | Yes[7] | Yes | Yes[5] | |
| Gellish | Yes | Yes | Yes | Yes[8] | No | Yes | Yes | ISO | No | Yes | Partial[6] | SQL, RDF/XML, OWL |

Nomenclature

  • Schemas – Whether the language definition is available in a computer interpretable form
  • Flexible – Whether the language enables extension of the semantic expression capabilities without modifying the schema
  • Semantic verification – Whether the language definition enables semantic verification of the correctness of expressions in the language
  • Dictionary-Taxonomy – Whether the language includes a dictionary and a taxonomy (subtype-supertype hierarchy) of concepts with inheritance
  • Synonyms and homonyms – Whether the language includes and supports the use of synonyms and homonyms in the expressions
  • Dialecting – Whether the language definition is available in multiple natural languages or dialects
  • Web or ISO standard – Organization that endorsed the language as a standard
  • Transformations – Whether the language includes a translation to other standards
  • Lightweight – Whether a lightweight version is available, in addition to a full version
  • Human-readable – Whether expressions in the language are human-readable (readable by humans without training)[15]
  • Compatibility – Which other tools are possible to use or required when using the language

Notes:

  1. RDF is a schema-flexible language.
  2. The schema of XML contains a very limited grammar and vocabulary.
  3. Available as an extension.
  4. In the default format, not the compact syntax.
  5. The syntax is fairly simple (the language was designed to be human-readable); the dialects may require domain knowledge.
  6. The standardized fact types are denoted by standardized English phrases, whose interpretation and use require some training.
  7. The Parse dialect is used to specify, validate, and transform dialects.
  8. The English version includes a Gellish English Dictionary-Taxonomy that also includes standardized fact types (= kinds of relations).

XML for data exchange

There are several reasons for the popularity of XML for data exchange on the World Wide Web. First of all, it is closely related to the preexisting standards Standard Generalized Markup Language (SGML) and Hypertext Markup Language (HTML), so a parser written to support these two languages can easily be extended to support XML as well. For example, XHTML has been defined as a format that is formally XML but is understood correctly by most (if not all) HTML parsers.[14]
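The XML side of this relationship can be checked with a generic XML parser. The sketch below uses Python's standard `xml.etree.ElementTree`; the two sample documents and the helper name `is_well_formed_xml` are assumptions made for illustration. An XHTML document parses as ordinary XML, while markup that relies on HTML-only conventions, such as an unclosed `<br>`, is rejected.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"

xhtml_doc = (
    f'<html xmlns="{XHTML_NS}">'
    "<head><title>Example</title></head>"
    "<body><p>Hello<br /></p></body>"
    "</html>"
)

# HTML permits an unclosed <br>; XML does not.
html_only_fragment = "<p>Hello<br></p>"


def is_well_formed_xml(text):
    """Return True if a generic XML parser accepts the text."""
    try:
        ET.fromstring(text)
    except ET.ParseError:
        return False
    return True


# Because XHTML is XML, its elements can be queried with ordinary XML tools.
root = ET.fromstring(xhtml_doc)
title = root.find("x:head/x:title", {"x": XHTML_NS}).text
```

Here `title` is read through a namespace-aware XML query, without any HTML-specific tooling.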

YAML for data exchange

YAML is a language that was designed to be human-readable (and as such to be easy to edit with any standard text editor). Its notation is often similar to reStructuredText or wiki syntax, which also try to be readable by both humans and computers. YAML 1.2 also includes a shorthand notation that is compatible with JSON, so any JSON document is also valid YAML; the converse, however, does not hold.[16]
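The asymmetry can be observed from the JSON side using only Python's standard `json` module. The two sample documents and the helper `is_valid_json` are assumptions for this sketch; the YAML side is not parsed here, since that would need a third-party YAML library. The first document uses the JSON-compatible shorthand, the second uses YAML's indentation-based notation for the same data.

```python
import json

# JSON-compatible shorthand: valid JSON, and also valid YAML 1.2.
flow_style = '{"name": "sensor-1", "readings": [1, 2, 3]}'

# Indentation-based YAML notation for the same data: not valid JSON.
block_style = "name: sensor-1\nreadings:\n  - 1\n  - 2\n  - 3\n"


def is_valid_json(text):
    """Return True if the standard JSON parser accepts the text."""
    try:
        json.loads(text)
    except json.JSONDecodeError:
        return False
    return True


flow_is_json = is_valid_json(flow_style)
block_is_json = is_valid_json(block_style)
```

A JSON parser accepts only the first document, whereas a YAML 1.2 parser is expected to accept both.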

REBOL for data exchange

REBOL is a language that was designed to be human-readable and easy to edit using any standard text editor. To achieve that, it uses a simple free-form syntax with minimal punctuation and a rich set of datatypes. REBOL datatypes such as URLs, emails, date and time values, tuples, strings, and tags respect the common standards. REBOL is designed in a metacircular fashion, so it does not need any additional meta-language. This metacircularity is the reason why, for example, the Parse dialect, which is used (not exclusively) for defining and transforming REBOL dialects, is itself a dialect of REBOL.[17] REBOL was used as a source of inspiration for JSON.[18]

Gellish for data exchange

Gellish English is a formalized subset of natural English. It includes a simple grammar and a large, extensible English Dictionary-Taxonomy that defines general and domain-specific terminology (terms for concepts). The concepts are arranged in a subtype-supertype hierarchy (a taxonomy), which supports inheritance of knowledge and requirements. The Dictionary-Taxonomy also includes standardized fact types (also called relation types). Together, the terms and relation types can be used to create and interpret expressions of facts, knowledge, requirements, and other information. Gellish can be used in combination with SQL, RDF/XML, OWL, and various other meta-languages. The Gellish standard is a combination of ISO 10303-221 (AP221) and ISO 15926.[19]

References

  1. Doan, A.; Halevy, A.; Ives, Z. (2012). Principles of Data Integration. Morgan Kaufmann. p. 276.
  2. Arenas, M.; Barceló, P.; Libkin, L.; Murlak, F. (2014). Foundations of Data Exchange. Cambridge University Press. pp. 1–11. ISBN 9781107016163. Retrieved 25 May 2018.
  3. Clancy, J.J. (2012). "Chapter 1: Directions for Engineering Data Exchange for Computer Aided Design and Manufacturing". In Wang, P.C.C. (ed.). Advances in CAD/CAM: Case Studies. Springer Science & Business Media. pp. 1–36. ISBN 9781461328193. Retrieved 25 May 2018.
  4. Kalish, C.E.; Mayer, M.F. (November 1981). "DIF: A format for data exchange between application programs". BYTE Magazine: 174.
  5. "About ODF". OpenDoc Society. Retrieved 25 May 2018.
  6. Zhu, X. (2016). GIS for Environmental Applications: A practical approach. Routledge. ISBN 9781134094509. Retrieved 25 May 2018.
  7. "KML Reference". Google Inc. 21 January 2016. Retrieved 25 May 2018.
  8. Martins, R.M.F.; Lourenço, N.C.C.; Horta, N.C.G. (2012). Generating Analog IC Layouts with LAYGEN II. Springer Science & Business Media. p. 34. ISBN 9783642331466. Retrieved 25 May 2018.
  9. Heidarzadeh, A.; Sprintson, A. (2017-03-30). "Optimal exchange of data over broadcast networks with adversaries". 2016 Information Theory and Applications Workshop (ITA). ISBN 978-1-5090-2529-9 – via IEEE.
  10. "What is a Broadcast?". IONOS Digital Guide. 2023-03-20. Retrieved 2024-04-03.
  11. "Unicast". IONOS Digital Guide. 2023-03-23. Retrieved 2024-04-03.
  12. Billingsley, F.C. (1988). "General Data Interchange Language". ISPRS Archives. 27 (B3): 80–91. Retrieved 25 May 2018. "The transformation routines will constitute a language and syntax which must be discipline and machine independent."
  13. Nurseitov, N.; Paulson, M.; Reynolds, R.; Izurieta, C. (2009). "Comparison of JSON and XML Data Interchange Formats: A Case Study". Scenario: 157–162.
  14. Lewis, J.; Moscovitz, M. (2009). AdvancED CSS. APress. pp. 5–6. ISBN 9781430219323. Retrieved 25 May 2018.
  15. "human-readable". Oxford Dictionaries. Oxford University Press. Archived from the original on May 30, 2018. Retrieved 29 May 2018.
  16. Bendersky, E. (22 November 2008). "JSON is YAML, but YAML is not JSON". Eli Bendersky's website. Retrieved 29 May 2018.
  17. Sassenrath, C. (2000). "The REBOL Scripting Language". Dr. Dobb's Journal. 25 (314): 64–8. Retrieved 29 May 2018.
  18. Sassenrath, C. (13 December 2012). "On JSON and REBOL". REBOL. Retrieved 29 May 2018.
  19. van Renssen, A.; Vermaas, P.E.; Zwart, S.D. (2007). "A Taxonomy of Functions in Gellish English". Proceedings from the International Conference on Engineering Design 2007: DS42_P_230. Retrieved 29 May 2018.