Fuzzy Content Search is the ability to search for similar words and phrases inside
documents and emails in a way that doesn't require you to remember exactly how
something was spelled or written. If you are a business professional who
frequently requires immediate access to lots of information, this feature can save a
significant amount of time and energy, making you a stronger, more productive, and
more effective member of your projects and teams.
There are other desktop search engines that search the contents of your documents
and emails. For example, the native Windows Search and Microsoft Outlook Search allow you to
do this. However, when you can't remember the exact way something was written or
spelled, these search engines cannot produce the necessary search results. On
the other hand, AKIN goes deep to find the documents and emails that contain
the most similar words and phrases. Additionally, AKIN scores those documents
based on both the similarity of the words found and their proximity to
one another, or how closely they match the search phrase entered. This produces more
sophisticated and relevant results.
Instead of getting a list of every document containing
the search words you entered without any sort of ranking, AKIN gives
you a nicely sorted list with the items most similar and relevant to your search
appearing at the top of the results. This means you don't have to scan the
entire list, opening and closing documents to find the most likely candidate.
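
To illustrate the idea of a similarity-ranked result list, here is a minimal Python sketch. It is not AKIN's actual algorithm, which is not described here; it assumes the standard library's difflib as a stand-in similarity measure, and the names best_word_similarity and rank_documents are hypothetical.

```python
from difflib import SequenceMatcher


def best_word_similarity(term: str, text: str) -> float:
    """Return the highest similarity (0.0 to 1.0) between term and any word in text."""
    words = text.lower().split()
    return max(
        (SequenceMatcher(None, term.lower(), word).ratio() for word in words),
        default=0.0,
    )


def rank_documents(query: str, documents: dict[str, str]) -> list[tuple[str, float]]:
    """Score each document by the average best match of the query words, best first."""
    terms = query.split()
    if not terms:
        return []
    scored = []
    for name, text in documents.items():
        score = sum(best_word_similarity(term, text) for term in terms) / len(terms)
        scored.append((name, score))
    # Most similar documents go to the top of the list.
    return sorted(scored, key=lambda item: item[1], reverse=True)


docs = {
    "memo.txt": "quarterly budget review for the marketing team",
    "notes.txt": "lunch menu and parking information",
}
# Both search words are misspelled, yet memo.txt still ranks first.
ranking = rank_documents("quartrly budjet", docs)
```

Because every document gets a score rather than a simple yes/no match, the misspelled query still places the most likely candidate at the top.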
Additionally, AKIN provides a powerful preview of items you find via content search.
This preview does an even deeper and more exhaustive search of your document and lists
the most similar occurrences in order, adjusting for relevance using word proximity and
the order in which the text was found.
How are Fuzzy Content Search results scored?
AKIN looks for the words that are most similar to your search in the contents of
the item. The similarity of those words produces an initial score between
0 and 100%. The number of times those words are actually used in the document
then increases the score slightly. For this reason, if AKIN finds
exact matches for your search terms, the score may show as slightly above
100%. In addition to scoring for similarity, AKIN can score based on word
proximity. When AKIN scores using word proximity, a number of heuristics are
used to determine which results are likely most relevant and these results are
pushed to the top of the results list.
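
The scoring described above can be sketched roughly as follows. This is an illustration under stated assumptions, not AKIN's implementation: the similarity measure (difflib), the bonus values FREQUENCY_BONUS and MAX_BONUS, and the simple word-distance "span" used for proximity are all invented for the example.

```python
from difflib import SequenceMatcher

FREQUENCY_BONUS = 0.5  # assumed: percentage points added per repeated occurrence
MAX_BONUS = 5.0        # assumed: cap so that frequency only nudges the score


def similarity(a: str, b: str) -> float:
    """Similarity of two words as a value between 0.0 and 1.0."""
    return SequenceMatcher(None, a, b).ratio()


def score_document(query: str, text: str) -> tuple[float, int]:
    """Return (score in percent, span in words between the best matches)."""
    words = text.lower().split()
    terms = query.lower().split()
    if not words or not terms:
        return 0.0, 0
    term_scores = []
    positions = []
    for term in terms:
        # The most similar word gives the initial 0 to 100% score for this term.
        best_word = max(words, key=lambda word: similarity(term, word))
        base = similarity(term, best_word) * 100
        # Repeated use of that word raises the score slightly.
        bonus = min((words.count(best_word) - 1) * FREQUENCY_BONUS, MAX_BONUS)
        term_scores.append(base + bonus)
        positions.append(words.index(best_word))
    score = sum(term_scores) / len(term_scores)
    # A smaller span means the matched words sit closer together.
    span = max(positions) - min(positions)
    return score, span


score, span = score_document(
    "budget report",
    "budget report budget summary and the annual budget",
)
```

In this example both search words match exactly and "budget" appears three times, so the score lands slightly above 100%. A result list could then use the span as a tie-breaker, placing documents with tighter matches higher.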
Are there any limitations to the AKIN Fuzzy Content Search?
When a search engine is more intelligent and it needs to index more information,
it must use more memory to store that information. AKIN search indexes are
loaded into RAM (Random Access Memory) for drastically improved performance.
Each AKIN user's desktop or laptop computer has a different amount of RAM
installed. For this reason, AKIN Fuzzy Content Search
has been designed to give the user control over what items are indexed and how
much of each item type's content is indexed.
For example, the user can decide to index only the content of files and not
email, to index only the top 10,000 emails, or to index only
the top 300 words from each document. Also, proximity detection and
scoring use more memory than non-proximity detection. The user can choose to turn
this on or off depending on their system's resources.
Since the number of emails received can grow quite large in proportion to the
total number of relevant items, and relevant email content is usually found
within the first few paragraphs of an email, AKIN has a default limit of 150 words
for Fuzzy Content Search indexing of emails. In the Fuzzy Content Search settings,
the user can increase this number or set it to "0" in order to index all the
words in the emails. We suggest that indexing more than
300 words of emails is usually only useful for power users such as attorneys and
other legal professionals. It is completely up to the
user how much memory to dedicate to the task, and there will be a
trade-off between functionality and the resources available for any particular
user. Today, RAM is relatively inexpensive and most modern systems come
installed with 6-8 GB of RAM, which should be more than enough to index the
contents of at least a couple hundred thousand items without
degrading system performance, depending
on how many words each item contains on average.
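
The indexing controls described above could be modeled along these lines. This is only a sketch: IndexSettings, its field names, and words_to_index are hypothetical and do not reflect AKIN's real settings storage. The 150-word email limit and the "0 means unlimited" convention come from the text; the other values are the examples mentioned above, not confirmed defaults.

```python
from dataclasses import dataclass


@dataclass
class IndexSettings:
    """Hypothetical container for the user-adjustable indexing limits."""
    index_files: bool = True
    index_emails: bool = True
    max_emails: int = 10_000            # example value from the text
    max_words_per_document: int = 300   # example value from the text
    max_words_per_email: int = 150      # default email limit stated in the text
    use_proximity: bool = False         # proximity scoring uses more memory


def words_to_index(text: str, max_words: int) -> list[str]:
    """Return the words that would be indexed; a limit of 0 means no limit."""
    words = text.split()
    return words if max_words == 0 else words[:max_words]


settings = IndexSettings()
email_body = "Please find the signed contract attached for your review"
indexed = words_to_index(email_body, settings.max_words_per_email)
first_three = words_to_index(email_body, 3)
everything = words_to_index(email_body, 0)
```

Lower limits keep fewer words per item in memory, which is the trade-off between functionality and resources discussed above.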
Microsoft Excel files can also cause the in-memory index to grow quickly
because of the large number of potentially unique
values that can occur in cells (especially numeric values). For this
reason, AKIN does not index numeric values for Fuzzy Content Search. Also, by
default, it only indexes the first worksheet in the Excel file and the first 10
rows containing data. However, there is a setting the user can adjust to
determine how many rows out of Excel files to read (a specific number or all
rows). This allows the user to control how much memory is used for
indexing Excel files. Also, remember that if you want to search for specific
numeric values (in an exact, non-fuzzy way), you can still do so, since AKIN
forwards your query to the Windows Search Service and combines those results
with its own before presenting them to the user.
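
As a rough picture of the Excel rules above (first worksheet only, a limited number of rows containing data, no numeric values), consider the sketch below. It assumes the workbook has already been read into plain Python lists, and the function name excel_terms and the use of None for "all rows" are choices made for this example, not AKIN's behavior.

```python
import re

# Matches plain numbers such as "1200", "-3" or "350.75".
NUMERIC = re.compile(r"[+-]?\d+([.,]\d+)*")


def excel_terms(workbook, max_rows=10):
    """Collect indexable words from the first worksheet.

    workbook: list of worksheets, each a list of rows, each row a list of cells.
    max_rows: number of rows containing data to read; None reads all rows.
    """
    if not workbook:
        return []
    terms = []
    rows_read = 0
    for row in workbook[0]:  # only the first worksheet is considered
        cells = [c for c in row if c is not None and str(c).strip()]
        if not cells:
            continue  # empty rows do not count toward the limit
        if max_rows is not None and rows_read >= max_rows:
            break
        rows_read += 1
        for cell in cells:
            if isinstance(cell, (int, float)):
                continue  # numeric cells are skipped
            for token in str(cell).lower().split():
                if not NUMERIC.fullmatch(token):
                    terms.append(token)
    return terms


workbook = [
    [["Name", "Amount"], [None, None], ["Rent", 1200], ["Travel", "350.75"]],
    [["ignored second sheet"]],
]
all_terms = excel_terms(workbook)
header_only = excel_terms(workbook, max_rows=1)
```

Raising max_rows (or passing None) indexes more of the sheet at the cost of a larger index.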
Words that are used very frequently in documents are seen as having little
to no informational relevance and are generally ignored. Additionally, because
numeric values are highly varied and can quickly increase the memory used by
the index, they are currently ignored. However, there is an explicit
inclusion list within AKIN that allows some common identifiers containing numerals to
be indexed. For example, "win7" and "win8" are values containing numerals that
AKIN will explicitly include in its Fuzzy Content index. Future updates to AKIN will
expand this inclusion list as necessary, and may also add a setting
that allows the user to add their own values to it.
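
The filtering rules above might look something like this in code. It is a sketch only: the stop-word sample and the function name indexable_terms are made up for illustration, and the only inclusion-list entries shown are the two named in the text.

```python
# Assumed sample of very frequent words; AKIN's real list is not given here.
STOP_WORDS = {"the", "a", "an", "and", "of", "to", "in", "is"}

# The two entries named in the text; the full inclusion list is not known.
INCLUDED_NUMERIC_TERMS = {"win7", "win8"}


def indexable_terms(text: str) -> list[str]:
    """Return the words that would be kept for the index."""
    terms = []
    for raw in text.lower().split():
        token = raw.strip(".,;:!?\"'()")
        if not token or token in STOP_WORDS:
            continue  # very frequent words carry little information
        has_digit = any(ch.isdigit() for ch in token)
        if has_digit and token not in INCLUDED_NUMERIC_TERMS:
            continue  # numeric values are ignored unless explicitly included
        terms.append(token)
    return terms


kept = indexable_terms("Upgrade the laptop to Win7, invoice 20481 and build v2.")
```

Here "win7" survives because it is on the inclusion list, while "20481" and "v2" are dropped along with the frequent words.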
However, as mentioned above, when AKIN opens a preview of any document found,
it performs an even more exhaustive search through the entire document to find the most relevant matches
and is not restricted by these indexing limits. You can also use this deep document search tool (the button labeled "DS") to open
any document you wish to search through using AKIN Fuzzy Content Search, at any time. Just use the "Browse"
button to browse to any file containing text that AKIN is able to open and read.
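
To give a feel for what listing the most similar occurrences in a single document involves, here is a small sketch. It is not how AKIN's deep document search works internally; it assumes difflib as the similarity measure, and top_occurrences is a hypothetical name.

```python
from difflib import SequenceMatcher


def top_occurrences(term: str, text: str, limit: int = 3) -> list[tuple[float, int, str]]:
    """Return up to `limit` (similarity, word position, word) entries, best first."""
    scored = []
    for position, raw in enumerate(text.split()):
        word = raw.lower().strip(".,;:!?\"'()")
        if not word:
            continue
        ratio = SequenceMatcher(None, term.lower(), word).ratio()
        scored.append((ratio, position, word))
    # Highest similarity first; earlier positions win ties.
    scored.sort(key=lambda item: (-item[0], item[1]))
    return scored[:limit]


text = "The recieve date differs from the received stamp; receipt attached."
matches = top_occurrences("received", text)
```

Every word in the document is compared against the search term, so nothing is skipped because of an indexing limit, and close misspellings appear right after the exact match.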