This page documents the identification and record linkage application note. It is part of AgGateway's ADAPT Minimum Viable Product (MVP). The purpose of this application note is to document best practices for what to do when an ADAPT file contains an identifier that is not recognized by the system receiving it.
Two or more agricultural businesses share some of the same customers and wish to exchange certain data in support of their mutual customers.
Contemporary farming requires the continuous exchange of information between growers and partners such as agronomists, retailers, custom applicators, insurance agents and customers. A critical part of this interoperability is identification, where a name or code (the “identifier”) is used to reference a particular instance of a data object. This allows Farm Management Information Systems (FMIS) to distinguish that unique instance from others, and to recognize that instance when encountering it again. It also enables Machine and Implement Control Systems (MICS) to keep track of what products to apply, and where to apply them.
There are many motivations for uniquely (and unambiguously) identifying resources in production agriculture data exchange. Examples include specifying the products being applied (or planned for application) in a particular field operation; specifying the location(s) where these products are applied; and enabling an audit trail (the aspiration of farm-to-fork traceability) for field operations processes.
Centralized approaches to identification are often used in agriculture: supply-chain operations increasingly use GTINs (Global Trade Item Numbers), GLNs (Global Location Numbers), and EANs (International Article Numbers), all codes minted by one or more numbering authorities. This makes clear where the identifier originated (its “source”) and what its meaning is. However, this approach doesn't currently work in field operations. A grower might use thousands of identifiers to name their farms, fields and documents. Those identifiers may be needed in situations with no Internet connectivity, and paying for identifiers is counter-intuitive to many end-users. Distributed minting of identifiers is therefore the reality in field operations, as the various FMIS and MICS solutions in the marketplace typically create their own identifiers. An additional twist is that there is little format standardization among these identifiers: different FMIS and MICS manufacturers use a variety of identifier data types, such as integers, GUIDs (Globally Unique Identifiers), URIs (Uniform Resource Identifiers), and proprietary string hashes.
Workflows involving identification often break down when a grower (or other actor) imports data into their FMIS from an external MICS or FMIS. Incoming data (that may correspond to objects such as farms, fields, and products in the grower’s own FMIS) might use externally-minted identifiers that the grower’s FMIS does not recognize, possibly in a format that does not match the one used by the FMIS. The user must then manually match these unknown identifiers with known objects in their system. (The process can be supported by spatial overlap checking, string comparison metrics, and so forth.) Users do not like this data mapping (also called record linkage or object identification) because it is time-consuming and error-prone, and it is ultimately an obstacle to the broader adoption of precision ag technologies. This is especially true as users increasingly expect frictionless data entry.
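As an illustration of the string comparison metrics mentioned above, the following is a minimal sketch that ranks known field names against an incoming, unrecognized name using Levenshtein edit distance. It is not part of ADAPT: the class and method names, the sample field names, and the 0.6 cut-off are all assumptions chosen for illustration, and a real system would likely combine such a score with spatial overlap and other evidence before suggesting a match to the user.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper, not part of ADAPT.
public static class RecordLinkageSketch
{
    // Levenshtein edit distance: minimum number of single-character
    // insertions, deletions, and substitutions turning a into b.
    public static int EditDistance(string a, string b)
    {
        var previous = new int[b.Length + 1];
        var current = new int[b.Length + 1];
        for (int j = 0; j <= b.Length; j++) previous[j] = j;

        for (int i = 1; i <= a.Length; i++)
        {
            current[0] = i;
            for (int j = 1; j <= b.Length; j++)
            {
                int substitution = previous[j - 1] + (a[i - 1] == b[j - 1] ? 0 : 1);
                int insertion = current[j - 1] + 1;
                int deletion = previous[j] + 1;
                current[j] = Math.Min(substitution, Math.Min(insertion, deletion));
            }
            // The row just computed becomes the "previous" row of the next pass.
            (previous, current) = (current, previous);
        }
        return previous[b.Length];
    }

    // Similarity in [0, 1]; 1.0 means identical after trimming and lower-casing.
    public static double Similarity(string a, string b)
    {
        string x = a.Trim().ToLowerInvariant();
        string y = b.Trim().ToLowerInvariant();
        int longest = Math.Max(x.Length, y.Length);
        return longest == 0 ? 1.0 : 1.0 - (double)EditDistance(x, y) / longest;
    }

    // Returns known names scoring at or above minimumScore, best match first.
    // The default threshold is an arbitrary illustrative value.
    public static List<(string Name, double Score)> RankCandidates(
        string incomingName, IEnumerable<string> knownNames, double minimumScore = 0.6)
    {
        return knownNames
            .Select(name => (Name: name, Score: Similarity(incomingName, name)))
            .Where(candidate => candidate.Score >= minimumScore)
            .OrderByDescending(candidate => candidate.Score)
            .ToList();
    }

    public static void Main()
    {
        // Made-up field names standing in for the grower's existing records.
        var knownFields = new[] { "North 40", "Home Place", "River Bottom" };
        foreach (var (name, score) in RankCandidates("Nrth 40", knownFields))
        {
            Console.WriteLine($"{name}: {score:F2}");
        }
    }
}
```

A ranking like this only narrows the choices shown to the user; the final link between an incoming object and a known one would still need confirmation.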
This note mainly advocates avoiding record linkage altogether by sharing identifiers across the ecosystem. It also includes some notes on how to resolve record linkage problems when they are unavoidable.
Figure 1: The CompoundIdentifier
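// Attach our own identifier to a new ADAPT Farm's CompoundIdentifier.
// "farm" is the exporting system's own farm record, defined elsewhere.
Farm adaptFarm = new Farm();
UniqueId ourId = new UniqueId();
ourId.Id = (string)farm["id"];
ourId.IdType = IdTypeEnum.UUID;
// The source states who minted the identifier.
ourId.Source = "http://www.ourcompany.com";
ourId.SourceType = IdSourceTypeEnum.URI;
adaptFarm.Id.UniqueIds.Add(ourId);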
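On the receiving side, an object that carries several source-qualified identifiers can be matched without manual linkage as soon as any one of them is already known. The sketch below illustrates that idea under stated assumptions: SourcedId and IdentifierIndex are hypothetical stand-ins (not ADAPT classes), the partner URL, identifier values, and local key are invented, and the code assumes C# 9 or later for records. It is one possible approach, not the ADAPT team's implementation.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for one (source, id) pair; this is not the ADAPT UniqueId class.
// Records compare by value, so two pairs with the same source and id are equal.
public sealed record SourcedId(string Source, string Id);

// Hypothetical index kept by the receiving system.
public sealed class IdentifierIndex
{
    // Maps every known (source, id) pair to this system's own local key.
    private readonly Dictionary<SourcedId, int> _localKeys = new();

    public void Register(SourcedId sourcedId, int localKey) => _localKeys[sourcedId] = localKey;

    // Returns the local key of the first incoming identifier that is already
    // known, or null when none of them is recognized.
    public int? Resolve(IEnumerable<SourcedId> incomingIds)
    {
        foreach (SourcedId sourcedId in incomingIds)
        {
            if (_localKeys.TryGetValue(sourcedId, out int localKey))
            {
                return localKey;
            }
        }
        return null;
    }
}

public static class Program
{
    public static void Main()
    {
        var index = new IdentifierIndex();

        // Our own identifier for a farm stored locally under key 17 (invented values).
        var ours = new SourcedId("http://www.ourcompany.com", "0f8fad5b-d9cb-469f-a165-70867728950e");
        index.Register(ours, 17);

        // An incoming farm carries a partner's identifier alongside ours.
        var partners = new SourcedId("http://www.partner.example", "42");
        var incoming = new List<SourcedId> { partners, ours };

        int? localKey = index.Resolve(incoming);
        if (localKey.HasValue)
        {
            // Remember the partner's identifier so later imports resolve directly.
            index.Register(partners, localKey.Value);
            Console.WriteLine($"Linked to local farm {localKey.Value}");
        }
        else
        {
            Console.WriteLine("No known identifier; manual record linkage needed");
        }
    }
}
```

Keying the index on the source and the identifier together, rather than on the identifier alone, avoids collisions between systems that happen to mint the same value, such as two FMIS products that both use small integers.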
The best way to deal with record linkage is to avoid it altogether. The solution proposed by the ADAPT team has two components: one technical, the other social.
The technical component:
The social component:
Even when there is perfect sharing of IDs across the industry, when an identifier first comes in "from the wild" (for example, because it was created in the cab of a machine) the first FMIS that consumes it has to solve the record linkage problem.
This is a topic where different companies can compete with sophisticated proprietary solutions to make the user experience as frictionless as possible. Fortunately, there is an abundance of scientific literature on the topic, primarily from the field of health care. Here are a couple of tips:
See the full TreeDemo example in the ADAPT Sample Code (https://github.com/ADAPT/ADAPT-Samples).
Understanding the CompoundIdentifier class is critical to understanding how the elements in the ADAPT ApplicationDataModel relate to each other and to the various applications contributing to their values.
Find more resources on Open-Source ADAPT site: AdaptFramework.org
Adapt.Feedback@AgGateway.org
For more information, including materials for joining ADAPT, visit: http://www.AdaptFramework.org
Chairs: Dan Danford, Case New Holland (Business); Stuart Rhea, Ag Connections (Technical)
Vice-Chair: Kelly Nelson, TopCon (Technical)