PinPoint
PinPoint is a visual workflow tool that makes it easy to
create and run high-performance data cleansing and
transformation processes. Using a "Tinkertoy®" approach,
you select components from a palette of tools. Each tool
performs a specialized operation. By configuring and
connecting these tools, you create a custom
data-transformation project. Running your project is as
simple as clicking a button. The results of your job can be
stored in a database or viewed in a spreadsheet or map
application.
The projects that you build using PinPoint resemble a
flowchart: records "flow" from the inputs, through the
tools, and to output. Each tool performs a simple function,
such as sorting, filtering, geocoding or joining. By
connecting the simple tools, you can create complex projects
tailored to meet your specific needs.
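As a rough illustration of the flowchart model, the sketch
below chains three simple tools (a filter, a sort, and a
join) as plain Python functions. This is not PinPoint's
actual interface, which this document does not describe; the
function names, field names, and sample records are all
assumptions made for the example.

```python
from typing import Callable, Dict, Iterable, Iterator, List

Record = Dict[str, object]


def filter_tool(records: Iterable[Record],
                keep: Callable[[Record], bool]) -> Iterator[Record]:
    """Pass through only the records for which keep(record) is true."""
    return (r for r in records if keep(r))


def sort_tool(records: Iterable[Record], key: str) -> List[Record]:
    """Return the records ordered by the given field."""
    return sorted(records, key=lambda r: r[key])


def join_tool(left: Iterable[Record], right: Iterable[Record],
              key: str) -> Iterator[Record]:
    """Inner join: pair each left record with right records sharing the key."""
    index: Dict[object, List[Record]] = {}
    for r in right:
        index.setdefault(r[key], []).append(r)
    for l in left:
        for r in index.get(l[key], []):
            yield {**r, **l}


# Hypothetical sample inputs.
customers = [
    {"id": 3, "name": "Cole", "state": "CO"},
    {"id": 1, "name": "Adams", "state": "CO"},
    {"id": 2, "name": "Baker", "state": "TX"},
]
orders = [{"id": 1, "total": 40}, {"id": 3, "total": 15}]


def run_project() -> List[Record]:
    """Records flow from the inputs, through each tool, to the output."""
    in_colorado = filter_tool(customers, lambda r: r["state"] == "CO")
    ordered = sort_tool(in_colorado, "id")
    return list(join_tool(ordered, orders, "id"))
```

Each function takes records in and hands records out, so the
output of any one can feed the input of any other, which is
the property that lets simple tools combine into a larger
project.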
PinPoint supports a variety of input and output data
formats. For high performance, flat files, DBF files, and
delimited value files are the best choice. For universal
access, PinPoint supports ODBC connections to a variety of
RDBMSs, including Oracle, Microsoft SQL Server, and DB2.
PinPoint Library
PinPoint is both an application driven via a Graphical User
Interface (GUI) and a library that implements the processing
tools. After building a custom data-processing project from
the GUI, you can export your project and embed it in your
own application. Your application calls the PinPoint library
at run-time to load your project definition and process
data. You can bring PinPoint's high performance and
flexibility to your own vertical-market applications.
Why PinPoint?
Spatial CRM - In addition to traditional data
cleansing and transformation processes, PinPoint is the
world's first tool that adds spatial intelligence to your
database. Adding geographic coordinates to your customer
files, appending spatial layers to prospect files or merging
polygons is as simple as dragging the tool onto the canvas
and pointing to the inbound and outbound files. PinPoint
reads and writes spatial formats including .tab, .mid/.mif,
.shp and .sdf to support popular GIS packages.
Performance - PinPoint is optimized to process an
entire data set, rather than transactions or queries. As a
result, it is hundreds of times faster than
transaction-oriented databases, even those running on
expensive hardware. PinPoint is written in C++, one of the
fastest programming languages available today.
The real key to PinPoint's superior performance is
sequential disk access. Sequential access means that records
are read in the order in which they are stored on disk.
Sequential access is fast because data is read in large
blocks. Random access, by contrast, reads each record from
whatever location the software requests, which bears no
relation to the order of physical storage on disk.
Random access is slow because the disk head must move to
access each record.
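The difference between the two access patterns can be
sketched in a few lines of Python. This is an illustration
only, not PinPoint code: the fixed 64-byte record size, the
block size, and the function names are assumptions chosen
for the example.

```python
import os
import random
import tempfile

RECORD_SIZE = 64        # assumed fixed-width record, in bytes
BLOCK_RECORDS = 4096    # assumed number of records fetched per read


def write_records(path, count):
    """Write `count` fixed-width records holding their own index."""
    with open(path, "wb") as f:
        for i in range(count):
            f.write(str(i).encode().ljust(RECORD_SIZE, b" "))


def read_sequential(path):
    """Read records in stored order, one large block at a time."""
    records = []
    with open(path, "rb") as f:
        while True:
            block = f.read(RECORD_SIZE * BLOCK_RECORDS)
            if not block:
                break
            for offset in range(0, len(block), RECORD_SIZE):
                records.append(block[offset:offset + RECORD_SIZE])
    return records


def read_random(path, order):
    """Read records in an arbitrary order, seeking before each one."""
    records = []
    with open(path, "rb") as f:
        for i in order:
            f.seek(i * RECORD_SIZE)
            records.append(f.read(RECORD_SIZE))
    return records
```

Both functions return the same records. The sequential
version issues one read per block of 4096 records, while the
random version issues one seek and one read per record. On a
small file held in the operating system's cache the timing
gap may not be visible; it grows as the file exceeds
available memory.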
You've probably heard of "Moore's Law", popularly taken to
mean that the speed and capacity of computer systems double
every year or two. This exponential growth in processing power has
driven the amazing advances in computer technology. Over the
years, the increase in performance of computer systems has
followed Moore's Law in every respect except one: random
access speeds. PinPoint's "tools" use sequential algorithms.
All operate on entire tables. Thus, no matter how you
connect the tools together, you cannot create a
low-performance algorithm, regardless of the size of your
data set.
Ease of Use - Real-world data is dirty and
non-uniform, especially so-called "legacy data". The process
needed to accomplish a given data-processing task may vary,
depending on the data, so customization is the rule rather
than the exception. But most off-the-shelf data-processing
products are written with a specific process in mind, making
customization difficult or impossible.
With PinPoint, you build a custom data-processing project
out of simple tools, so you are not limited by anyone's
preconceived ideas of your data or process. You can change
much more than the parameters of a task; you can change the
fundamental algorithm defining the task.
Consider the merge-purge problem. Each data source may exist
in a different format and state of cleanliness. You may need
to apply a different cleaning and standardization process to
each input before starting the merge-purge process. Within
merge-purge, you might find that some of the data needs to
be treated differently (perhaps because some fields to be
compared are missing). You must perform a slightly different
process on the abnormal records, and then reconcile the
results with the normal records. With PinPoint, multiple
processing paths and alternate algorithms are trivial to
implement. PinPoint's friendly visual interface makes it
easy to split the data flow, route it through alternate
algorithms, and join the results at the end.
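The split, route, and join pattern described above might
look like the following sketch. It is not PinPoint's
merge-purge implementation: the field names, the matching
rules (name plus ZIP for complete records, name plus phone
for records missing a ZIP), and the simple concatenation
used to reconcile the two paths are all assumptions made for
the example.

```python
def normalize(value):
    """Standardize a field: uppercase and collapse whitespace."""
    return " ".join(str(value or "").upper().split())


def split(records, required):
    """Route records with all required fields one way, the rest another."""
    normal, abnormal = [], []
    for r in records:
        complete = all(r.get(field) for field in required)
        (normal if complete else abnormal).append(r)
    return normal, abnormal


def purge(records, fields):
    """Keep the first record seen for each key built from `fields`."""
    seen = set()
    kept = []
    for r in records:
        key = tuple(normalize(r.get(field)) for field in fields)
        if key not in seen:
            seen.add(key)
            kept.append(r)
    return kept


def merge_purge(sources):
    """Merge all sources, dedupe each path by its own rule, then rejoin."""
    merged = [r for source in sources for r in source]
    normal, abnormal = split(merged, required=("name", "zip"))
    # Assumed rules: match on name + ZIP when both are present,
    # otherwise fall back to name + phone.
    return purge(normal, ("name", "zip")) + purge(abnormal, ("name", "phone"))
```

A production merge-purge would typically use fuzzy matching
and would also check the abnormal records against the normal
ones; the exact-match keys here only show how alternate
paths fit together.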
Cost Effectiveness - PinPoint is optimized to run on
inexpensive PC hardware running Windows NT. The street price
for a high-performance desktop computer with 256MB RAM and
72GB of disk storage is now less than $5,000. Such a system
offers 40MB/second of sustained disk throughput, which means
that the entire 72GB can be read or written in about 30
minutes. PinPoint can easily achieve throughput rates of 20
to 30 million records per hour on such a system, even on
complex jobs like merge-purge.
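The 30-minute figure follows directly from the capacity and
throughput quoted above, and the hourly record rates can be
restated per second in the same way. The short calculation
below assumes decimal units (1 GB = 1000 MB); the function
names are made up for the example.

```python
def full_scan_minutes(capacity_gb, throughput_mb_per_s):
    """Minutes needed to read or write the whole disk once."""
    seconds = capacity_gb * 1000 / throughput_mb_per_s  # 1 GB = 1000 MB (assumed)
    return seconds / 60


def records_per_second(millions_per_hour):
    """Convert a rate in millions of records per hour to records per second."""
    return millions_per_hour * 1_000_000 / 3600


print(full_scan_minutes(72, 40))        # 30.0
print(round(records_per_second(20)))    # 5556
print(round(records_per_second(30)))    # 8333
```

With binary units (1 GB = 1024 MB) the scan time comes to
just under 31 minutes, so "about 30 minutes" holds either
way.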
PinPoint Tools
PinPoint offers a comprehensive palette of database tools,
each of which performs a specific, simple job. The tools all
"plug" together: any input connects to any output. To build
your data-processing engine, simply connect and configure
the tools you need.
Hardware Requirements
PinPoint runs on Windows NT® and Windows® 95, 98, and 2000.
Licensing PinPoint
PinPoint is available under single-seat or multiple-seat
licenses. Volume pricing is available. PinPoint can be
licensed for desktop use to read and write flat files only,
or can be licensed to allow access to enterprise RDBMS
sources. Spatial tools are included; geocoding, list count,
and selection features are optional add-on modules. All PinPoint
software is available free of charge with business or
household data licensing commitments.