Once upon a time, mining was among the major, fastest-growing industries. Not surprisingly, history does repeat itself. But this time around, it's not the Au kind of gold we're digging for.
One of the emerging and most powerful disruptive technologies of this day and age is termed data mining. As defined by Kurt Thearling in An Introduction To Data Mining, data mining is the extraction of hidden predictive information from large databases. In simpler words... reading between the mines.
As the subject name indicates, this post was written for our data mining class with Mr. Ramon Duremdes Jr. Other insights on the topic can be found on the blogs of my co-data miners, whose links are on the left of this page.
Data mining helps companies focus on all the relevant information they can get hold of. Together with business intelligence, data mining tools, given databases of adequate size and quality, can create business opportunities by automating the prediction of trends and behaviors and the discovery of previously unidentified patterns.
Data mining tools can answer business questions that were traditionally too time-consuming to resolve. They quarry databases for veiled patterns, stumbling upon information that human perception might overlook.
To survive in today's business environment, many companies, aside from gathering and analyzing massive amounts of data, also see the need to integrate data mining tools into their existing structures.
Data mining, along with artificial intelligence, was listed in a GartnerGroup Advanced Technology Note among the major technology areas that are booming and will keep on booming, and in which companies across diverse industries will be investing within the next couple of years. Well, with all the wonders of data mining, who wouldn't?
Companies nowadays are quite enthusiastic about this rapidly developing concept because extensive research and development has transformed data mining drastically, with diverse techniques surfacing along the way. Among these techniques, the most frequently used are artificial neural networks, decision trees, genetic algorithms, the nearest neighbor method and rule induction, to name a few.
But before going into deeper concepts, how is it even imaginable that data mining can tell us things we do not know, or what will happen in the week, month or year ahead, given such uncertain and unstable economic conditions? The answer? Read on.
The food chain begins with data and information, which of course are essential to creating business intelligence. Business intelligence, according to SAS, is the technology and practice of using information to make decisions. Business intelligence is considered to be obtained when information shows its true worth. And once that happens, not just a window but an infinite number of doors open.
The cycle continues with knowledge workers using the appropriate Information Technology tools to define and analyze the correlations and logic that lie behind the information. Examples of these tools are databases, database management systems, data warehouses and data mining tools.
According to Haag, Phillip and Cummings in Management Information Systems for The Information Age, data mining tools are the software used to query information in a data warehouse in support of online analytical processing (OLAP), which is the manipulation of information to support decision-making tasks.
Data mining tools include query-and-reporting tools, intelligent agents, multidimensional analysis tools and statistical tools.
- Query-and-reporting tools are similar to Query-By-Example tools, Structured Query Language (SQL) and report generators in the typical database environment.
- Intelligent agents use several artificial intelligence tools, such as neural networks and fuzzy logic, to form the basis of "information discovery" and build business intelligence.
- Multidimensional analysis tools are slice-and-dice techniques that permit the viewing of multidimensional information from different viewpoints.
- Statistical tools aid in applying various mathematical models to the information stored in a data warehouse to discover new information.
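To make the slice-and-dice idea a bit more concrete, here is a minimal sketch in plain Python. The sales records, the dimension names and the helper functions `slice_cube` and `roll_up` are all made up for illustration; a real multidimensional analysis tool would run against a data warehouse rather than an in-memory list.

```python
from collections import defaultdict

# Hypothetical sales facts: (region, product, quarter, revenue).
SALES = [
    ("North", "Loans", "Q1", 120),
    ("North", "Cards", "Q1", 80),
    ("South", "Loans", "Q1", 100),
    ("South", "Cards", "Q2", 60),
    ("North", "Loans", "Q2", 150),
]

# Position of each dimension inside a record (an assumption of this sketch).
DIMENSIONS = {"region": 0, "product": 1, "quarter": 2}


def slice_cube(rows, dimension, value):
    """Slice: keep only the rows where one dimension has a fixed value."""
    idx = DIMENSIONS[dimension]
    return [row for row in rows if row[idx] == value]


def roll_up(rows, dimension):
    """Total the revenue grouped by a single dimension."""
    idx = DIMENSIONS[dimension]
    totals = defaultdict(int)
    for row in rows:
        totals[row[idx]] += row[3]
    return dict(totals)


# The same facts seen from two different viewpoints.
by_region = roll_up(SALES, "region")
q1_by_product = roll_up(slice_cube(SALES, "quarter", "Q1"), "product")
print(by_region)
print(q1_by_product)
```

The same list of facts answers two different questions (revenue per region, and first-quarter revenue per product) simply by changing which dimension is fixed and which one is grouped on.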
Data mining techniques, according to Kurt Thearling in An Overview of Data Mining Techniques, are classified into classical techniques and next generation techniques.
The classical techniques are the techniques used 99.9% of the time on existing business problems, mainly because industries rely on techniques that are consistent, understandable and explainable. Among the classical techniques are the following:
- Statistics: statistical techniques have long been in use and, strictly speaking, are not data mining techniques. Nonetheless, they are driven by data and are used to discover trends and build forecasting models. Knowing how statistical techniques work and how they can be applied is a big factor in using them well.
- Neighborhoods: among the first techniques used in data mining, nearest neighbor is a prediction technique similar to clustering. To predict a value for one record, it looks for records with similar predictor values in the historical database and uses the prediction value from the record that is closest to the unclassified record.
- Clustering: a method through which similar records are grouped together in order to give the business a bird's-eye view of what is happening within the database.
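The nearest neighbor idea from the list above can be sketched in a few lines of Python. The historical records, the two predictors (age and income) and the `nearest_neighbor` function are hypothetical, and plain Euclidean distance is just one possible way of measuring "closest".

```python
import math

# Hypothetical historical records: ((age, income in thousands), known outcome).
HISTORY = [
    ((25, 30), "churned"),
    ((47, 85), "stayed"),
    ((52, 90), "stayed"),
    ((23, 28), "churned"),
]


def nearest_neighbor(history, predictors):
    """Return the outcome of the historical record closest to `predictors`."""
    closest = min(history, key=lambda record: math.dist(record[0], predictors))
    return closest[1]


# Predict the outcome for a new, unclassified customer.
prediction = nearest_neighbor(HISTORY, (50, 88))
print(prediction)
```

In practice the predictors would usually be scaled first, so that a column with large numbers does not dominate the distance.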
Trees, networks and rules are classified under the next generation techniques. These techniques have been the most frequently used over the past two decades of research and can be used for discovering fresh information within large databases or for building predictive models.
- Trees: decision tree algorithms tend to automate the whole process of creating hypotheses and validating them in a more complete and integrated manner than any other data mining technique. Trees are proficient at handling raw data with little or no pre-processing and have been applied to problems ranging from credit card attrition prediction to time series prediction of currency exchange rates.
- Networks: artificial neural networks are computer programs that implement pattern detection and machine learning algorithms to build predictive models from large historical databases.
- Rules: rule induction is one of the key forms of data mining and perhaps the most common form of knowledge discovery. It bears a resemblance to the process that people picture when they think about data mining, namely "mining": digging for gold through an immense database.
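As a rough illustration of what "digging" for a rule looks like, the sketch below scores a single candidate rule of the form "if A then B" against a handful of records. The customer records, the attribute names and the `evaluate_rule` function are invented for this example; coverage and accuracy here are simply how often the rule applies and how often it is right when it does.

```python
# Hypothetical customer records, each a set of attributes the customer has.
RECORDS = [
    {"has_mortgage", "has_savings", "has_credit_card"},
    {"has_mortgage", "has_credit_card"},
    {"has_savings"},
    {"has_mortgage", "has_savings", "has_credit_card"},
    {"has_savings", "has_credit_card"},
]


def evaluate_rule(records, antecedent, consequent):
    """Score the rule 'if antecedent then consequent'.

    coverage: fraction of all records where the antecedent holds.
    accuracy: of those records, the fraction where the consequent also holds.
    """
    matching = [r for r in records if antecedent <= r]
    if not matching:
        return 0.0, 0.0
    correct = [r for r in matching if consequent <= r]
    return len(matching) / len(records), len(correct) / len(matching)


# Candidate rule: "if the customer has a mortgage, they also have a credit card".
coverage, accuracy = evaluate_rule(RECORDS, {"has_mortgage"}, {"has_credit_card"})
print(coverage, accuracy)
```

A rule induction tool would generate and score many such candidate rules automatically and keep only the ones that apply often enough and are right often enough to be useful.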
One industry that stands to benefit from data mining tools, and ought to be using them, is the banking industry.
Companies want to comprehend the risks they are facing; they want to understand and relate to customers before their competitors do; they want to know what drives their costs and profits; they want to be in the know. And all of this can be, and is being, done with the right information, the right tools, the right people... in short, data mining.
More and more companies are making use of this booming concept, and with good reason: it can yield significant benefits. But what must be kept in mind is that there is a lot more to it than simply purchasing a toolkit.