Analysis of data is a process of inspecting, cleaning, transforming, and modeling data with the goal of highlighting useful information, suggesting conclusions, and supporting decision making. Data analysis has multiple facets and approaches, encompassing diverse techniques under a variety of names, in different business, science, and social science domains.
Data mining is a particular data analysis technique that focuses on modeling and knowledge discovery for predictive rather than purely descriptive purposes. Business intelligence covers data analysis that relies heavily on aggregation, focusing on business information.
The human species’ use of technology began with the conversion of natural resources into simple tools. The prehistoric discovery of the ability to control fire increased the available sources of food, and the invention of the wheel helped humans to travel in and control their environment. Later technological developments, including the printing press, the telephone, and the Internet, have lessened physical barriers to communication and allowed humans to interact freely on a global scale. However, not all technology has been used for peaceful purposes; the development of weapons of ever-increasing destructive power has progressed throughout history, from clubs to nuclear weapons.
In statistical applications, some people divide data analysis into descriptive statistics, exploratory data analysis (EDA), and confirmatory data analysis (CDA). EDA focuses on discovering new features in the data and CDA on confirming or falsifying existing hypotheses. Predictive analytics focuses on application of statistical or structural models for predictive forecasting or classification, while text analytics applies statistical, linguistic, and structural techniques to extract and classify information from textual sources, a species of unstructured data. All are varieties of data analysis.
Data integration is a precursor to data analysis, and data analysis is closely linked to data visualization and data dissemination. The term data analysis is sometimes used as a synonym for data modeling.
The use of technology to compute and analyze data has taken on another dimension: sophisticated software can now be used to access a wide range of information online. Technology offers a potential medium through which data computing, analysis, and implementation could be made easier and more likely to occur (Ysseldyke & McLeod, 2007).
The use of technology makes ongoing data collection, data consumption, and data-based decision making a more plausible proposition, and it can keep these important aspects of analysis from monopolizing teacher time. Previous research found that the use of technology substantially facilitated collecting, managing, and analyzing educational data (McIntire, 2002; McLeod, 2005; Pierce, 2005; Wayman, 2005). Thus, technology-enhanced assessment (TEA) would likely support data analysis, and applying technology to other aspects of data computing would likely enhance the implementation of those components as well.
Technology could also help facilitate implementation of an effective Tier 1 through various applications and by assessing the quality of the core instruction. Version 3 of the Ecobehavioral Assessment System Software (EBASS) is based on the research of Greenwood, Carta, and Atwater (1991) and provides a technology-enhanced assessment of the instructional environment. EBASS is a software system that school personnel can use to conduct systematic classroom observational assessments with laptop, notebook, or hand-held computers.
Many organizations have collected and stored large amounts of data in massive datasets that need to be processed and analyzed to provide business intelligence, improve products and services for customers, or meet other internal data processing requirements, and they have adopted technological processes for this purpose. For example, Internet companies need to process data collected by Web crawlers as well as logs, click data, and other information generated by Web services. Parallel relational database technology has not proven to be cost-effective or to provide the high performance needed to analyze massive amounts of data in a timely manner. As a result, several organizations developed technology that uses large clusters of commodity servers to provide high-performance computing capabilities for processing and analysis of massive datasets. Clusters can consist of hundreds or even thousands of commodity machines connected using high-bandwidth networks. Examples of this type of cluster technology include Google’s MapReduce, Apache Hadoop, Aster Data Systems, Sector/Sphere, and the LexisNexis HPCC platform.
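The map, group, and reduce pattern behind systems such as MapReduce can be illustrated with a small single-machine sketch. It is not the API of Google's MapReduce, Apache Hadoop, or any other platform named above; the function names, the "<client> <url>" record format, and the sample click log are all assumptions made for illustration. On a real cluster the map and reduce steps would run on many machines at once.

```python
from collections import defaultdict


def map_phase(record):
    """Emit one (url, 1) pair for a log record.

    Assumed (hypothetical) record format: "<client> <url>".
    """
    _client, url = record.split()
    yield url, 1


def shuffle(pairs):
    """Group all emitted values by key, as a cluster would do between phases."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Combine all values that share a key into a single count."""
    return key, sum(values)


def run_job(records):
    """Run map, shuffle, and reduce in sequence on one machine."""
    mapped = (pair for record in records for pair in map_phase(record))
    grouped = shuffle(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


# Made-up click data, standing in for logs generated by a Web service.
click_log = [
    "10.0.0.1 /home",
    "10.0.0.2 /products",
    "10.0.0.1 /products",
    "10.0.0.3 /home",
    "10.0.0.2 /home",
]
counts = run_job(click_log)
print(counts)
```

Because each record is mapped independently and each key is reduced independently, both phases could in principle be spread across many commodity machines without changing the result.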
High Performance Computing
High-Performance Computing (HPC) is another form of technology; the term describes computing environments that use supercomputers and computer clusters to address complex computational requirements, support applications with significant processing time requirements, or process significant amounts of data. Supercomputers have generally been associated with scientific research and compute-intensive problems, but supercomputer technology is increasingly appropriate for both compute-intensive and data-intensive applications. A trend in supercomputer design for high-performance computing is the use of clusters of independent processors connected in parallel. Many computing problems are suitable for parallelization: often a problem can be divided so that each independent processing node works on a portion of it in parallel, simply by dividing the data to be processed and then combining the final processing results for each portion. This type of parallelism is often referred to as data-parallelism, and data-parallel applications are a potential solution to petabyte-scale data processing requirements. Data-parallelism can be defined as a computation applied independently to each data item of a set of data, which allows the degree of parallelism to be scaled with the volume of data. The most important reason for developing data-parallel applications is the potential for scalable performance in high-performance computing, which may result in performance improvements of several orders of magnitude.
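The divide, process, and combine idea behind data-parallelism can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not an HPC implementation: the function names and the sum-of-squares workload are hypothetical, and a thread pool from the standard library stands in for the independent processing nodes of a cluster.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_chunks(items, num_chunks):
    """Divide items into at most num_chunks contiguous, roughly equal parts."""
    chunk_size = max(1, -(-len(items) // num_chunks))  # ceiling division
    return [items[i:i + chunk_size] for i in range(0, len(items), chunk_size)]


def partial_sum_of_squares(chunk):
    """Work done independently by one worker on its own portion of the data."""
    return sum(x * x for x in chunk)


def parallel_sum_of_squares(items, num_workers=4):
    """Divide the data, process each portion independently, combine results."""
    chunks = split_into_chunks(items, num_workers)
    # Threads are an assumption standing in for separate processing nodes.
    with ThreadPoolExecutor(max_workers=num_workers) as pool:
        partials = list(pool.map(partial_sum_of_squares, chunks))
    return sum(partials)  # combine the per-portion results


data = list(range(1, 1001))
result = parallel_sum_of_squares(data, num_workers=4)
print(result)
```

The same computation is applied to every data item, so the number of chunks (and workers) can grow with the volume of data. For CPU-bound work in Python, replacing `ThreadPoolExecutor` with `ProcessPoolExecutor` from the same module is the usual way to use multiple cores.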
Technology has affected society and its surroundings in a number of ways. In many societies, technology has helped develop more advanced economies (including today’s global economy) and has allowed the rise of a leisure class. Many technological processes produce unwanted by-products, known as pollution, and deplete natural resources, to the detriment of the Earth and its environment. Various implementations of technology influence the values of a society and new technology often raises new ethical questions. Examples include the rise of the notion of efficiency in terms of human productivity, a term originally applied only to machines, and the challenge of traditional norms.