Organizing Big Data for Smart Manufacturing 227
This class is an overview of Big Data and how it is used in smart manufacturing. Smart factories capture large amounts of raw data from a variety of sources, all of which must be stored and analyzed. Data, both unstructured and structured, is stored in repositories such as databases, data warehouses, and data lakes. Once data has context, it can be used for analytics and exchanged with multiple parties within an organization or supply chain through secure distribution channels and systems like distributed ledgers.
Big Data provides manufacturers with insights for improving operations and overall production. After completing this course, users will have a better understanding of the characteristics of Big Data and its purpose in smart manufacturing.
Number of Lessons 11
- Data in Manufacturing
- Big Data
- Data Storage: Database
- Data Storage: Data Warehouse
- Data Storage: Data Lake
- Big Data and Storage Review
- Data Contextualization
- Data Exchange Standards
- Distributed Ledgers and Blockchain
- Data Visualization and Analytics
- Final Review
- Describe data in manufacturing.
- Describe Big Data.
- Describe databases.
- Describe data warehouses.
- Describe data lakes.
- Describe how context is added to raw data in information models.
- Describe data exchange standards.
- Describe distributed ledgers and blockchain.
- Explain how data is used for visualization and analytics.
The main characteristics of Big Data. The 5 Vs of Big Data are volume, variety, velocity, veracity, and value.
A mathematical process designed to systematically solve a problem. Complex digital algorithms allow machine learning to analyze stored data.
AI. A computer program with algorithms that enable a machine or computer to imitate intelligent human behavior. Artificial intelligence allows machines to perform a process with autonomy.
A valuable collection of information from the devices or assets in an operation. Big Data can be analyzed to reveal patterns and make calculations.
A growing list of digital records, called blocks, that are linked. Blockchain technology is a decentralized, distributed ledger that records the origin of a digital asset.
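The linking of blocks described above can be illustrated with a minimal sketch. It assumes each block stores the SHA-256 hash of the previous block, which is one common way to link records; the field names and part records are hypothetical, and the sketch leaves out consensus, digital signatures, and distribution across sites.

```python
import hashlib
import json

GENESIS_PREV = "0" * 64  # placeholder "previous hash" for the first block


def block_hash(block):
    """Hash the block's contents (everything except its own hash)."""
    payload = json.dumps(
        {k: block[k] for k in ("index", "data", "prev_hash")}, sort_keys=True
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def add_block(chain, data):
    """Append a block that stores the hash of the block before it."""
    prev_hash = chain[-1]["hash"] if chain else GENESIS_PREV
    block = {"index": len(chain), "data": data, "prev_hash": prev_hash}
    block["hash"] = block_hash(block)
    chain.append(block)
    return block


def is_valid(chain):
    """Check every stored hash and every link to the previous block."""
    for i, block in enumerate(chain):
        if block["hash"] != block_hash(block):
            return False
        expected_prev = chain[i - 1]["hash"] if i > 0 else GENESIS_PREV
        if block["prev_hash"] != expected_prev:
            return False
    return True


# Hypothetical production records for one part
chain = []
add_block(chain, {"part": "A-100", "event": "cast"})
add_block(chain, {"part": "A-100", "event": "machined"})
print(is_valid(chain))
```

Because each block's hash depends on its contents and on the previous block's hash, editing an earlier record makes `is_valid` fail.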
The process of fixing or removing incorrect, corrupted, incorrectly formatted, duplicate, or incomplete data within a dataset. Cleansing the data creates a more reliable dataset.
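As a rough illustration of cleansing, the sketch below handles incorrectly formatted, incomplete, corrupted, and duplicate records in a small list of sensor readings. The field names (`machine`, `timestamp`, `temp_c`) and the rules for what counts as a bad record are assumptions made for this example.

```python
def cleanse(records):
    """Return records with formatting fixed and unusable rows removed."""
    seen = set()
    clean = []
    for rec in records:
        # Incorrectly formatted: normalize whitespace and letter case
        machine = (rec.get("machine") or "").strip().upper()
        temp = rec.get("temp_c")
        # Incomplete: drop rows missing a machine ID or a reading
        if not machine or temp is None:
            continue
        # Corrupted: drop rows whose reading cannot be read as a number
        try:
            temp = float(temp)
        except (TypeError, ValueError):
            continue
        # Duplicate: keep only the first row per machine and timestamp
        key = (machine, rec.get("timestamp"))
        if key in seen:
            continue
        seen.add(key)
        clean.append(
            {"machine": machine, "timestamp": rec.get("timestamp"), "temp_c": temp}
        )
    return clean


# Hypothetical raw readings
raw = [
    {"machine": " cnc-01 ", "timestamp": "2024-01-01T08:00", "temp_c": "71.5"},
    {"machine": "CNC-01", "timestamp": "2024-01-01T08:00", "temp_c": 71.5},
    {"machine": "CNC-02", "timestamp": "2024-01-01T08:00", "temp_c": None},
    {"machine": "CNC-03", "timestamp": "2024-01-01T08:00", "temp_c": "err"},
]
cleaned = cleanse(raw)
print(cleaned)
```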
Relating to remote servers accessed through the internet. Cloud-based data lakes store data remotely.
The practice of constantly measuring the effectiveness of processes and striving to meet more difficult objectives to satisfy customers. Continuous optimization is driven by data collection and analysis.
CPSs. A hardware device that links physical objects and processes with virtual objects and processes in an interconnected network. A cyber-physical system increases the capabilities of a device.
The process of identifying data relevant to an entity based on the entity’s context to make it more useful. Data contextualization allows users to better interpret data and use it to make decisions.
data exchange standards
A standard for sharing structured data across different information systems. Data exchange standards specify common formats and shared rules that lead to compatible data.
A computer application or system that manages massive amounts of raw data from a variety of sources. Data lakes can be located on premises or be cloud-based.
The process of analyzing data with software in order to draw out unknown patterns in the data. Data mining is used to discover useful information in large sets of unlabeled, unorganized data.
A concrete representation that organizes elements of data and standardizes how they relate to one another. Data models come in a variety of structures.
An individual who uses computing systems to perform data analysis in order to interpret and present insights gained from data. Data scientists can help manufacturers operate more efficiently by adjusting processes based on data analysis.
The representation of data through the use of common graphics, such as charts, plots, infographics, and even animations.
A computer application or system that manages massive amounts of structured, processed data from a variety of sources. Data warehouses can be located on premises or be cloud-based.
Computer storage that holds data and is searchable. A database both stores and organizes information.
database management system
DBMS. Software designed to define, manipulate, retrieve, and manage data in a database. A database management system is necessary for users to perform operations with the data.
A single asset or unit that consists of similar types of collected information. Datasets are used by manufacturers to organize the information they collect.
The allocation of resources, both hardware and software, to each individual workstation or office location. Decentralized databases are components of a smart manufacturing network.
A group of systems and devices connected to one another through wired or wireless connections. Digital networks allow data to transfer electronically from one location to another.
A unique identifier that validates the authenticity of a document or file. Digital signatures are used in blockchain to prevent fraud.
A consensus of replicated, shared, and synchronized digital data geographically spread across multiple sites, countries, or institutions. In a distributed ledger, there is no central administrator or centralized data storage.
The buying and selling of goods and services, or the transmitting of funds or data, over an electronic network. E-commerce is an important aspect of smart manufacturing production.
The person who uses the physical part created by a manufacturer. End users interact with data using software applications.
enterprise resource planning
ERP. A management information system that takes data from multiple internal and external information sources to integrate business processes throughout the organization. Enterprise resource planning is often integrated into the product lifecycle management, operations management, and supply chain management systems via the digital thread.
A type of data model in which data is represented as entities with attributes and their relationship to other entities. Entity-relationship models are used as a starting point for a database.
Extract, Transform, and Load
ETL. A data integration process that combines data from multiple data sources into a single, consistent dataset that is loaded into a data warehouse. Extract, Transform, and Load is necessary for organizing data.
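The three ETL steps can be sketched as follows. This is a minimal example, assuming two hypothetical sources (sensor readings in Fahrenheit and production counts from an MES) and using an in-memory SQLite database as a stand-in for a data warehouse; the table and field names are invented for illustration.

```python
import sqlite3

# Extract: two hypothetical sources with different ID formats and units
sensor_rows = [
    {"machine": "CNC-01", "temp_f": 158.0},
    {"machine": "CNC-02", "temp_f": 149.0},
]
mes_rows = [("cnc-01", 120), ("cnc-02", 95)]  # (machine ID, parts produced)


def transform(sensor_rows, mes_rows):
    """Merge both sources into one consistent record per machine."""
    parts = {machine.upper(): count for machine, count in mes_rows}
    merged = []
    for row in sensor_rows:
        machine = row["machine"].upper()
        temp_c = round((row["temp_f"] - 32) * 5 / 9, 1)  # Fahrenheit to Celsius
        merged.append((machine, temp_c, parts.get(machine)))
    return merged


def load(conn, rows):
    """Write the transformed rows into the target table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS machine_summary "
        "(machine TEXT PRIMARY KEY, temp_c REAL, parts INTEGER)"
    )
    conn.executemany("INSERT INTO machine_summary VALUES (?, ?, ?)", rows)
    conn.commit()


conn = sqlite3.connect(":memory:")
load(conn, transform(sensor_rows, mes_rows))
summary = conn.execute(
    "SELECT machine, temp_c, parts FROM machine_summary ORDER BY machine"
).fetchall()
print(summary)
```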
A type of data model in which data is represented in the form of a graph. Graph models, also known as network models, show relationships among nodes.
A type of data model in which data is organized into a tree-like structure that has a single root node with data linked to it. The hierarchical model contains a one-to-many relationship between types of data.
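A hierarchical model can be sketched as a tree in which every node hangs off a single root. The `Node` class and the plant, line, and machine names below are hypothetical and chosen only to show the one-to-many relationship.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    children: list = field(default_factory=list)

    def add(self, child):
        """Attach a child node and return it so calls can be chained."""
        self.children.append(child)
        return child


def paths(node, prefix=""):
    """List the path from the root to every node in the tree."""
    path = f"{prefix}/{node.name}"
    result = [path]
    for child in node.children:
        result.extend(paths(child, path))
    return result


plant = Node("plant")              # root node
line = plant.add(Node("line-1"))   # one parent, many children
line.add(Node("cnc-01"))
line.add(Node("cnc-02"))
all_paths = paths(plant)
print(all_paths)
```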
Industrial Internet of Things. A network of physical devices used in manufacturing that contain computing systems that allow them to send and receive data. The IIoT allows devices to exchange data and automate processes without any human intervention.
Industrial Internet of Things
IIoT. A network of physical devices used in manufacturing that contain computing systems that allow them to send and receive data. The Industrial Internet of Things allows devices to exchange data and automate processes without any human intervention.
An abstract representation of entities. Information models are the basis for data models.
The collection of hardware and software used to collect, store, and process data. Information systems are typically housed by the information technology (IT) department of an organization.
The ability of different computerized devices or systems to readily connect and exchange information with one another. Interoperability is key to IIoT technologies.
A systematic approach to sourcing, storing, and selling inventory. Inventory management keeps raw materials and products adequately stocked.
An attribute or group of attributes used to uniquely identify data. Keys are often the columns in a table of a relational database.
A channel in a data model that connects two or more nodes. Links can only connect related data nodes.
ML. The process that enables a digital system to analyze data in order to build predictive models and make decisions autonomously. Machine learning systematically solves problems using highly complex algorithms.
A data visualization tool that contains current and updated information related to performance and operations throughout the supply chain. Real-time manufacturing dashboards are used in smart factories to fully utilize data.
manufacturing execution systems
MESs. A software application that monitors and analyzes machine performance data related to processing materials and producing finished products. Manufacturing execution systems can be used to integrate performance data from multiple machines in a product’s lifecycle.
Manufacturing Technology Connect
MTConnect. A communication standard that works as a universal translator between computer software and thousands of different machine and device languages. Manufacturing Technology Connect is open source and can be easily modified or adapted to a specific task.
Message Queue Telemetry Transport
MQTT. A protocol that provides a scalable and reliable way to connect devices over the Internet. Message Queue Telemetry Transport allows messages to be sent between devices.
machine learning. The process that enables a digital system to analyze data in order to build predictive models and make decisions. ML systematically solves problems using highly complex algorithms.
A type of data model in which data is represented in the form of a graph. Network models, also known as graph models, show relationships among nodes.
A single unit of data in a data model. Nodes are typically related to one another.
Open Platform Communications Unified Architecture
OPC UA. A collection of communication protocols and standards used with industrial equipment. Open Platform Communications Unified Architecture works as a universal translator, allowing computer software to communicate easily with thousands of different manufacturing sensors, CNC machines, and other compatible devices.
original equipment manufacturers
OEMs. An organization that provides parts and equipment to the supply chain. Original equipment manufacturers supply parts that are used in finished products.
overall equipment effectiveness
OEE. A measure of how effectively equipment is being used. High overall equipment effectiveness leads to a more efficient production line and higher throughput.
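The definition above does not give a formula. A widely used formulation, assumed here, multiplies availability, performance, and quality. The sketch below uses hypothetical shift figures and assumes all inputs are positive.

```python
def oee(planned_min, downtime_min, ideal_cycle_min, total_count, good_count):
    """Compute OEE as availability x performance x quality."""
    run_time = planned_min - downtime_min
    availability = run_time / planned_min                     # share of planned time spent running
    performance = (ideal_cycle_min * total_count) / run_time  # actual speed vs. ideal speed
    quality = good_count / total_count                        # share of parts that are good
    return availability * performance * quality


# Hypothetical shift: 400 planned minutes, 40 minutes of downtime,
# 1.0-minute ideal cycle time, 324 parts made, 243 of them good
score = oee(
    planned_min=400,
    downtime_min=40,
    ideal_cycle_min=1.0,
    total_count=324,
    good_count=243,
)
print(f"OEE: {score:.1%}")
```

In this example, availability is 0.9, performance is 0.9, and quality is 0.75, so the three factors show where the losses come from, not just the overall score.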
A maintenance approach that involves collecting data related to machine operation in order to service a machine before maintenance issues arise. A predictive maintenance approach involves performing maintenance before failures occur to prevent unscheduled downtime and reduce the labor, cost, and time needed for repairs.
A measurable characteristic. Process variables can include velocity, position, temperature, or any other process-related condition.
The series of stages a product goes through from conception to the end of its useful life. The product lifecycle includes design, production, distribution, and end-use by the customer.
product lifecycle management
PLM. A computer software program that traces all information related to the manufacture of a product from prototype to failure. Product lifecycle management software uses the digital thread to integrate all cyber-physical data related to the manufacture of a part and beyond.
A set of standards and rules used by network devices to interact with each other. Protocols are, essentially, the language that networked devices use to communicate.
QC. A system of managing quality by inspecting finished products to make sure they meet specifications. Quality control relies on error detection and correction.
A request for information. Users can make a query to retrieve data or change information in a database.
A type of database in which the data items have pre-defined relationships between them. A relational database both stores and organizes information into searchable tables.
A type of data model in which data tables are used to collect groups of data elements into relations. Relational models are used for modeling databases.
The first unit of data in a data model. Data nodes branch off of the root node.
The length of time a program takes to execute. Runtime data models are a common practice in data storage.
The information-driven, event-driven, efficient, and collaborative orchestration of business, physical, and digital processes within plants, factories, and across the entire value chain. Smart manufacturing increases connectivity between all areas of product development.
A device equipped with software that can detect physical changes in the environment and process them as digital signals. Smart sensors are more advanced than normal digital sensors since they use their own computing software to process data rather than sending the data to an external system to be processed.
A computer application that stores data in tables. Spreadsheet programs are a type of database.
Information that is highly organized and follows a data model. Structured data may appear as datasets that can easily be stored in a database.
structured query language
SQL. A programming language that is used to manage relational databases and perform various operations on the data. Structured query language is used by most relational databases.
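A short sketch of SQL in use, run through Python's built-in `sqlite3` module against an in-memory database. The `machines` and `readings` tables and their columns are hypothetical. Each table has a key that uniquely identifies its rows, and the query joins the two tables on the shared `machine_id` column.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Define two related tables; each PRIMARY KEY uniquely identifies a row
conn.execute("CREATE TABLE machines (machine_id TEXT PRIMARY KEY, cell TEXT)")
conn.execute(
    "CREATE TABLE readings ("
    "reading_id INTEGER PRIMARY KEY, "
    "machine_id TEXT REFERENCES machines(machine_id), "
    "temp_c REAL)"
)

# Insert hypothetical data
conn.executemany(
    "INSERT INTO machines VALUES (?, ?)",
    [("CNC-01", "A"), ("CNC-02", "B")],
)
conn.executemany(
    "INSERT INTO readings (machine_id, temp_c) VALUES (?, ?)",
    [("CNC-01", 70.0), ("CNC-01", 74.0), ("CNC-02", 65.0)],
)

# Query: average temperature per machine, joined through the shared key
rows = conn.execute(
    "SELECT m.machine_id, m.cell, AVG(r.temp_c) "
    "FROM machines AS m "
    "JOIN readings AS r ON r.machine_id = m.machine_id "
    "GROUP BY m.machine_id, m.cell "
    "ORDER BY m.machine_id"
).fetchall()
print(rows)
```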
A complex network of companies and suppliers that produce and distribute a product. A supply chain consists of a company, its suppliers, its distributors, and its customers.
The period of usage until product performance fails. Time-to-failure is predicted using collected data in predictive maintenance.
The ability to verify the history, application, or location of an item using data flows. Digital threads enable traceability throughout the entire production lifecycle.
User interface. A device that allows a user to control a software application or hardware device. UIs vary according to manufacturer, but they generally include a display screen and controls.
Information in its original format. Unstructured data is raw data in the form of files, images, and text.
UI. A device that allows a user to control a software application or hardware device. User interfaces vary according to manufacturer, but they generally include a display screen and controls.
In the 5 Vs of Big Data, the usefulness of the data. Value is typically discovered by data analytics and is most important to an organization.
In the 5 Vs of Big Data, the type of data. The variety of Big Data should be diverse and come from many different sources.
In the 5 Vs of Big Data, the speed at which data is generated. The velocity of Big Data generation is very fast.
In the 5 Vs of Big Data, the reliability of the data. The veracity of Big Data amounts to the accuracy of data and information.
In the 5 Vs of Big Data, the quantity of generated and stored data. Big Data is generated and stored at very high volumes.
A computer application that enables the inputting, editing, formatting, and outputting of text. Word-processing software gives manufacturers the ability to create reports and document information.