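Data
Original title: Data
Analysis results
- Category
- AI
- Importance
- 60
- Trend score
- 24
- Summary
- Data refers to facts, concepts, or instructions expressed in a form suitable for communication, interpretation, or processing by humans or machines.
- Keywords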
Data - Grokipedia
Fact-checked by Grok 2 months ago

Data represents facts, concepts, or instructions in forms suitable for communication, interpretation, or processing by humans or machines. [1] In computing, it underpins storage, analysis, and decision-making, and is typically raw before being processed into information. [2]

Data is categorized by structure and nature. Structured data follows predefined formats, such as database rows and columns, allowing efficient querying via tools like SQL. [3] Unstructured data, by contrast, lacks a fixed schema and includes emails, images, videos, and social media posts. It makes up most contemporary data and demands advanced methods like natural language processing for extraction. [4] Data further divides into quantitative (numerical values for measurement and statistics, e.g., sales or temperatures) and qualitative (non-numerical descriptions, e.g., feedback or responses). [5]

Data's role has expanded rapidly, powering advances in science, business, and governance through insights. [6] Organizations apply data analytics to refine decision-making, operations, and predictions, enhancing efficiency and innovation. [7] In research, digital data facilitates modeling and discovery, from genomics to climate studies. [8] This growth, however, introduces issues like privacy, data quality, and ethical concerns, requiring stringent standards and regulations. [9]

Fundamentals

Etymology and Terminology

The word "data" originates from the Latin datum, the neuter past participle of dare meaning "to give," thus translating to "something given" or "a thing granted." As the plural form data, it entered English in the mid-17th century, with the Oxford English Dictionary recording its earliest evidence in 1645 in the writings of Scottish author and translator Thomas Urquhart, where it referred to facts or propositions given as a basis for reasoning or calculation in scientific and mathematical contexts.
[10] [11] Initially borrowed directly from Latin scientific texts, the term appeared in English via scholarly works emphasizing empirical observations and computations. A historical milestone in the application of data occurred in 1662 with John Graunt's Natural and Political Observations Made upon the Bills of Mortality, which analyzed London parish records to derive demographic patterns, representing one of the earliest systematic uses of aggregated numerical data in what is now recognized as descriptive statistics, even though Graunt himself did not employ the specific term "data." The concept gained further traction in scientific discourse throughout the 17th and 18th centuries. By the 1950s, "data" was widely adopted in computing, notably by IBM in naming its systems, such as the 1953 IBM 701 Electronic Data Processing Machine, which processed large volumes of numerical information for business and scientific purposes, solidifying the term's role in technological contexts. [12] [13]

In the 20th century, particularly with the expansion of computing, the usage of "data" evolved from its traditional plural form (taking verbs like "are") to a mass noun treated as singular, as in "data is," reflecting its conceptualization as an undifferentiated collection rather than discrete items. Google Books Ngram analysis shows the singular form rising from a minority in the early 1900s to parity with the plural by the late 20th century. [14]

Key terminological distinctions include raw data, defined as unprocessed facts, figures, or symbols without inherent meaning or context, versus information, which arises when raw data is organized, processed, and interpreted to convey significance, as outlined in standards like the U.S. Department of Defense's data management framework.
[15] Modern style guides address the singular/plural debate: the American Psychological Association (APA) recommends plural treatment ("data are") in formal and scientific writing for precision, while the Chicago Manual of Style permits either, favoring singular for general audiences but plural in technical contexts to honor the word's Latin roots. [16] [17]

Definitions and Meanings

Data is defined as the representation of facts, concepts, or instructions in a formalized manner suitable for communication, interpretation, or processing by humans or automated systems. This encompasses numerical values, textual descriptions, symbolic notations, or other discrete units that capture observations or measurements without inherent context or significance on their own. For instance, raw sensor readings from a thermometer recording temperatures at specific intervals exemplify data as unprocessed inputs awaiting analysis.

A key distinction lies between data and related concepts like information, where data serves as the raw, unstructured foundation, while information emerges from its organization, contextualization, and interpretation to convey meaning. This relationship is formalized in the DIKW hierarchy, which progresses from data (basic symbols or signals) to information (processed and related facts), knowledge (applied understanding through patterns and rules), and wisdom (evaluative judgment for decision-making). The hierarchy, introduced by Russell L. Ackoff in 1989, underscores that data alone lacks meaning until transformed, as seen in examples like isolated numbers from a database becoming meaningful sales trends when aggregated and analyzed.

In philosophical contexts, data refers to empirical observations or sense-data that form the basis of perceptual experience and epistemic justification, distinct from interpretive thought.
These are immediate sensory impressions, such as visual or auditory inputs, that philosophers like Bertrand Russell analyzed as mind-independent entities grounding knowledge claims. In legal settings, data functions as evidentiary facts: recorded information or predicate details that support inferences in judicial proceedings, such as digital logs or witness statements admissible under rules like Federal Rule of Evidence 703. Everyday usage treats data as personal records, including health metrics, financial transactions, or location histories, which individuals manage for practical purposes like budgeting or fitness tracking.

Since the early 2000s, the meaning of data has evolved to incorporate digital traces of user behavior, driven by the rise of Web 2.0 platforms and big data analytics, where unstructured logs from social interactions and online activities are treated as valuable raw inputs for predictive modeling. This shift, exemplified by the growth of user-generated content on sites like early social media, expanded data's scope beyond traditional records to encompass behavioral patterns analyzed for targeted advertising and personalization.

Types of Data

Data can be broadly classified into qualitative and quantitative types based on its nature and measurability. Qualitative data, also known as categorical data, consists of non-numerical information that describes qualities, characteristics, or attributes, such as text, images, audio, or observations that capture themes, patterns, or meanings without assigning numerical values. [18] In contrast, quantitative data is numerical and measurable, allowing for mathematical operations like counting, averaging, or statistical analysis; it includes values such as heights, temperatures, or sales figures that represent quantities or amounts.
[19] This distinction is fundamental in research and analysis, where qualitative data provides depth and context, while quantitative data enables precision and generalizability. [20]

Another key categorization distinguishes structured from unstructured data based on organization and format. Structured data is highly organized and stored in a predefined format, such as rows and columns in relational databases or spreadsheets, making it easily searchable, analyzable, and integrable with tools like SQL; examples include customer records in a CRM system or sensor readings in fixed schemas. [21] Unstructured data, comprising about 80-90% of all data generated today, lacks a predetermined structure and includes free-form content like emails, social media posts, videos, or documents that require advanced processing techniques for extraction and interpretation. [21] This divide impacts storage, processing efficiency, and application, with structured data suiting traditional analytics and unstructured data fueling modern AI-driven insights. [22]

Additional classifications refine these categories further. Discrete data consists of distinct, countable values with no intermediate points, such as the number of items sold (integers) or categories like gender, which can only take specific, separated states. [23] Continuous data, however, forms a spectrum of infinite possible values within a range, measurable to any degree of precision, as in weight, time, or temperature, often represented by real numbers. [23] Separately, primary data is original information collected firsthand by the researcher for a specific purpose, through methods like surveys or experiments, ensuring direct relevance but requiring more resources. [24] Secondary data, derived from existing sources compiled by others, such as published reports or databases, offers broader scope and cost savings but may introduce biases or outdated elements.
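The structured/unstructured and discrete/continuous distinctions described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `sales` table, its columns, the sample rows, and the email text are all hypothetical and do not come from the article. It shows that a predefined schema can be queried directly with SQL, while the same kind of fact in free-form text has to be extracted first.

```python
import re
import sqlite3

# Structured data: a predefined schema (hypothetical "sales" table).
# units_sold is discrete (countable integers); weight_kg is continuous (real-valued).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (item TEXT, units_sold INTEGER, weight_kg REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [("apples", 12, 3.75), ("pears", 4, 1.2), ("plums", 9, 2.05)],
)

# Because the schema is known in advance, SQL can filter and sort directly.
top_items = [
    row[0]
    for row in conn.execute(
        "SELECT item FROM sales WHERE units_sold >= ? ORDER BY units_sold DESC",
        (9,),
    )
]
total_weight = conn.execute("SELECT SUM(weight_kg) FROM sales").fetchone()[0]
conn.close()

# Unstructured data: free-form text with no schema. The fact must be extracted,
# here with a deliberately naive regular expression (an assumption for this sketch;
# real text would need far more robust processing).
email = "Hi team, the customer says they bought 12 apples and loved them."
match = re.search(r"bought (\d+) (\w+)", email)
extracted = (match.group(2), int(match.group(1))) if match else None

print(top_items)
print(round(total_weight, 2))
print(extracted)
```

The regular expression stands in for the "advanced processing techniques" the text mentions; it works only because the sample sentence happens to match the pattern, which is exactly the fragility that a fixed schema avoids.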
[24]

Emerging types of data reflect evolving technological and analytical needs. Big data is characterized by the "three Vs," as defined by Gartner in 2011: high volume (massive scale of data generation), velocity (rapid speed of data creation and processing), and variety (diverse formats from structured to unstructured sources). Together these demand innovative handling beyond traditional systems. Metadata, or "data about data," provides descriptive context for other data, including details like creation date, author, format, or location, standardized by ISO/IEC 11179 to facilitate interoperability and management ac