AKIN Advanced Data Analytics & Profiling

For licensing inquiries, please contact info@grappledata.com using a valid company/organization email address.

The Data Analysis & Profiling API lets businesses and organizations use AKIN to inspect and analyze their data, assess its state and quality, and identify key issues in support of data quality and cleansing projects.

Key features of this API include the ability to:
  • Provide General Statistics for database tables and/or views
    • Record Counts
    • Attribute/Field Counts
    • Field Value Sizes
      • Max value size with specific record reference
      • Average value size
  • Identify unused fields in database tables and/or views
  • Identify fields that are mostly unused
  • Identify attributes/fields that are common to multiple or all tables
  • Analyze Distributions of values within table/view fields
    • Values that appear very infrequently when most of the field's values are used more often. These rare values diverge from the general pattern of value usage in the table field.
    • Values that appear very frequently when most values are used only very infrequently
    • Identify records having field values that are very large compared to what is average or normal for the field
    • Identify records having field values that are smaller than what is average or normal for the field
    • Identify the most common data type that is represented within string/text value fields
    • Detect string/text values in a field that appear to represent a data type that is not normal or common for the field
  • Detect duplicate values (Fuzzy Pattern Recognition)
    • Clustering of likely duplicate values
    • Results exported to CSV files
    • Import of the final reviewed CSV file, which is used to auto-generate record/value update scripts in various formats
  • Duplicate Entity Detection (Fuzzy Pattern Recognition)
    • Identify duplicate entities that are defined within one table/view, or that span multiple tables/views
    • Ability to apply generic entity assessment rules or our domain/entity-type-specific assessment rules
  • Supports scanning all tables/views or selective scan of specific tables/views
  • Allows exclusion of specific attributes/fields
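
To make a few of the profiling ideas above concrete, here is a minimal sketch in plain Python. It is not AKIN's API and does not reflect how AKIN is implemented. The function names (profile_fields, classify_fields, rare_values), the 10% thresholds, and the sample records are all hypothetical, chosen only for illustration. It assumes a table has already been loaded as a list of dictionaries, and it treats None and blank strings as "unused" values.

```python
from collections import Counter


def profile_fields(records):
    """Return per-field statistics for a list of dict records (illustrative only)."""
    fields = sorted({name for rec in records for name in rec})
    stats = {}
    for name in fields:
        sizes = []  # (value size, record index) for each non-empty value
        for idx, rec in enumerate(records):
            value = rec.get(name)
            if value is None or str(value).strip() == "":
                continue  # assumption: None and blank strings count as unused
            sizes.append((len(str(value)), idx))
        filled = len(sizes)
        # On ties, max() with a key returns the first record having the largest value.
        max_size, max_idx = max(sizes, key=lambda p: p[0]) if sizes else (0, None)
        stats[name] = {
            "filled": filled,
            "fill_rate": filled / len(records) if records else 0.0,
            "max_size": max_size,
            "max_size_record": max_idx,
            "avg_size": sum(s for s, _ in sizes) / filled if filled else 0.0,
        }
    return stats


def classify_fields(stats, mostly_unused_below=0.1):
    """Split fields into never-used and mostly-unused (hypothetical 10% cutoff)."""
    unused = [n for n, s in stats.items() if s["filled"] == 0]
    mostly_unused = [
        n for n, s in stats.items() if 0 < s["fill_rate"] < mostly_unused_below
    ]
    return unused, mostly_unused


def rare_values(records, field, max_share=0.1):
    """Return values whose share of the non-empty values is at most max_share."""
    counts = Counter(
        rec[field] for rec in records if rec.get(field) not in (None, "")
    )
    total = sum(counts.values())
    return [value for value, count in counts.items() if count / total <= max_share]


# Hypothetical sample data: 20 records with one misspelled city and one long note.
records = [{"id": i, "city": "Boston", "fax": "", "notes": None} for i in range(20)]
records[7]["city"] = "Bostn"
records[3]["notes"] = "call back after the quarterly review"

stats = profile_fields(records)
unused, mostly_unused = classify_fields(stats)
rare_cities = rare_values(records, "city")
```

In this sketch, a field that is empty in every record lands in unused, a field filled in fewer than 10% of records lands in mostly_unused, and max_size_record points back to the specific record holding the largest value. A real profiler would run such checks against database tables and views rather than in-memory lists, and would need more careful rules for what counts as an empty value.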