IBM Spectrum Discover: Metadata Management for Deep Insight of Unstructured Storage

Author :
Publisher : IBM Redbooks
Total Pages : 152
Release :
ISBN-10 : 0738457868
ISBN-13 : 9780738457864
Rating : 4/5 (64 Downloads)

Synopsis IBM Spectrum Discover: Metadata Management for Deep Insight of Unstructured Storage by : Joseph Dain

This IBM® Redpaper publication provides a comprehensive overview of the IBM Spectrum® Discover metadata management software platform. It explains in detail how the product creates, collects, and analyzes metadata. Several in-depth use cases show examples of analytics, governance, and optimization. The paper also provides step-by-step instructions for installing and setting up the IBM Spectrum Discover trial environment. More than 80% of all data that is collected by organizations is not in a standard relational database. Instead, it is trapped in unstructured documents, social media posts, machine logs, and so on. Many organizations face significant challenges in managing this deluge of unstructured data, such as: pinpointing and activating relevant data for large-scale analytics; lacking the fine-grained visibility that is needed to map data to business priorities; removing redundant, obsolete, and trivial (ROT) data; and identifying and classifying sensitive data. IBM Spectrum Discover is modern metadata management software that provides data insight for petabyte-scale file and object storage, both on premises and in the cloud. This software enables organizations to make better business decisions and to gain and maintain a competitive advantage. IBM Spectrum Discover provides a rich metadata layer that enables storage administrators, data stewards, and data scientists to efficiently manage, classify, and gain insights from massive amounts of unstructured data. It improves storage economics, helps mitigate risk, and accelerates large-scale analytics to create competitive advantage and speed critical research.
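The idea of flagging redundant, obsolete, and trivial (ROT) data from file metadata can be illustrated with a minimal sketch. It does not use the IBM Spectrum Discover API. The record fields (`path`, `size_bytes`, `days_since_access`, `checksum`), the function name `classify_rot`, and both thresholds are hypothetical choices made only for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class FileRecord:
    """Hypothetical metadata record; field names are assumptions."""
    path: str
    size_bytes: int
    days_since_access: int
    checksum: str


def classify_rot(records, obsolete_after_days=365, trivial_below_bytes=1024):
    """Return {path: set of labels} for files flagged as ROT.

    The thresholds are illustrative defaults, not recommended values.
    """
    by_checksum = defaultdict(list)
    for rec in records:
        by_checksum[rec.checksum].append(rec)

    flags = defaultdict(set)
    for group in by_checksum.values():
        # Keep the first copy (in path order); flag the others as redundant.
        for dup in sorted(group, key=lambda r: r.path)[1:]:
            flags[dup.path].add("redundant")
    for rec in records:
        if rec.days_since_access > obsolete_after_days:
            flags[rec.path].add("obsolete")
        if rec.size_bytes < trivial_below_bytes:
            flags[rec.path].add("trivial")
    return dict(flags)


catalog = [
    FileRecord("/proj/a/report.pdf", 250_000, 12, "c1"),
    FileRecord("/proj/b/report_copy.pdf", 250_000, 400, "c1"),
    FileRecord("/tmp/empty.log", 0, 30, "c2"),
]
rot = classify_rot(catalog)
for path, labels in sorted(rot.items()):
    print(path, sorted(labels))
```

In this sketch a file can carry more than one label, and files with no label are omitted from the result. A real deployment would draw these attributes from the metadata catalog instead of an in-memory list.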

Cataloging Unstructured Data in IBM Watson Knowledge Catalog with IBM Spectrum Discover

Author :
Publisher : IBM Redbooks
Total Pages : 108
Release :
ISBN-10 : 073845902X
ISBN-13 : 9780738459028
Rating : 4/5 (28 Downloads)

Synopsis Cataloging Unstructured Data in IBM Watson Knowledge Catalog with IBM Spectrum Discover by : Joseph Dain

This IBM® Redpaper publication explains how IBM Spectrum® Discover integrates with the IBM Watson® Knowledge Catalog (WKC) component of IBM Cloud® Pak for Data (IBM CP4D) to make the enriched catalog content in IBM Spectrum Discover, along with the associated data, available in WKC and IBM CP4D. From an end-to-end IBM solution point of view, IBM CP4D and WKC provide state-of-the-art data governance, collaboration, and artificial intelligence (AI) and analytics tools, and IBM Spectrum Discover complements these features by adding support for unstructured data on large-scale file and object storage systems on premises and in the cloud. Many organizations face challenges in managing unstructured data, including: pinpointing and activating relevant data for large-scale analytics, machine learning (ML), and deep learning (DL) workloads; lacking the fine-grained visibility that is needed to map data to business priorities; removing redundant, obsolete, and trivial (ROT) data and identifying data that can be moved to a lower-cost storage tier; and identifying and classifying sensitive data as it relates to various compliance mandates, such as the General Data Protection Regulation (GDPR), the Payment Card Industry Data Security Standard (PCI-DSS), and the Health Insurance Portability and Accountability Act (HIPAA). This paper describes how IBM Spectrum Discover provides seamless integration of data in IBM Storage with WKC. Features include: event-based cataloging and tagging of unstructured data across the enterprise; automatically inspecting and classifying over 1000 unstructured data types, including genomics-specific and imaging-specific file formats; automatically registering assets with WKC based on IBM Spectrum Discover search and filter criteria, and by using assets in IBM CP4D; and enforcing data governance policies in WKC in IBM CP4D based on insights from IBM Spectrum Discover, and using assets in IBM CP4D.
Several in-depth use cases show examples from healthcare, life sciences, and financial services. IBM Spectrum Discover integration with WKC enables storage administrators, data stewards, and data scientists to efficiently manage, classify, and gain insights from massive amounts of data. The integration improves storage economics, helps mitigate risk, and accelerates large-scale analytics to create competitive advantage and speed critical research.

IBM Reference Architecture for High Performance Data and AI in Healthcare and Life Sciences

Author :
Publisher : IBM Redbooks
Total Pages : 88
Release :
ISBN-10 : 073845690X
ISBN-13 : 9780738456904
Rating : 4/5 (04 Downloads)

Synopsis IBM Reference Architecture for High Performance Data and AI in Healthcare and Life Sciences by : Dino Quintero

This IBM® Redpaper publication provides an update to the original description of IBM Reference Architecture for Genomics. This paper expands the reference architecture to cover all of the major vertical areas of healthcare and life sciences industries, such as genomics, imaging, and clinical and translational research. The architecture was renamed IBM Reference Architecture for High Performance Data and AI in Healthcare and Life Sciences to reflect the fact that it incorporates key building blocks for high-performance computing (HPC) and software-defined storage, and that it supports an expanding infrastructure of leading industry partners, platforms, and frameworks. The reference architecture defines a highly flexible, scalable, and cost-effective platform for accessing, managing, storing, sharing, integrating, and analyzing big data, which can be deployed on-premises, in the cloud, or as a hybrid of the two. IT organizations can use the reference architecture as a high-level guide for overcoming data management challenges and processing bottlenecks that are frequently encountered in personalized healthcare initiatives, and in compute-intensive and data-intensive biomedical workloads. This reference architecture also provides a framework and context for modern healthcare and life sciences institutions to adopt cutting-edge technologies, such as cognitive life sciences solutions, machine learning and deep learning, Spark for analytics, and cloud computing. To illustrate these points, this paper includes case studies describing how clients and IBM Business Partners alike used the reference architecture in the deployments of demanding infrastructures for precision medicine. This publication targets technical professionals (consultants, technical support staff, IT Architects, and IT Specialists) who are responsible for providing life sciences solutions and support.

HIPAA Compliance for Healthcare Workloads on IBM Spectrum Scale

Author :
Publisher : IBM Redbooks
Total Pages : 18
Release :
ISBN-10 : 0738458600
ISBN-13 : 9780738458601
Rating : 4/5 (01 Downloads)

Synopsis HIPAA Compliance for Healthcare Workloads on IBM Spectrum Scale by : Sandeep R. Patil

When technology workloads process healthcare data, it is important to understand Health Insurance Portability and Accountability Act (HIPAA) compliance and what it means for the technology infrastructure in general and storage in particular. HIPAA is US legislation that was signed into law in 1996. HIPAA was enacted to protect health insurance coverage, but was later extended to ensure protection and privacy of electronic health records and transactions. In simple terms, it was instituted to modernize the exchange of healthcare information and the way that the Personally Identifiable Information (PII) that is maintained by the healthcare and healthcare-related industries is safeguarded. From a technology perspective, one of the core requirements of HIPAA is the protection of Electronic Protected Health Information (ePHI) through physical, technical, and administrative defenses. From a non-compliance perspective, the Health Information Technology for Economic and Clinical Health Act (HITECH) added protections to HIPAA and increased penalties to between $100 USD and $50,000 USD per violation. Today, HIPAA-compliant solutions are a norm in the healthcare industry worldwide. This IBM® Redpaper publication describes HIPAA compliance requirements for storage and how security-enhanced software-defined storage is designed to help meet those requirements. We correlate how Software Defined IBM Spectrum® Scale security features address the safeguards that are specified by the HIPAA Security Rule.

IBM Power Systems Enterprise AI Solutions

Author :
Publisher : IBM Redbooks
Total Pages : 64
Release :
ISBN-10 : 0738458058
ISBN-13 : 9780738458052
Rating : 4/5 (52 Downloads)

Synopsis IBM Power Systems Enterprise AI Solutions by : Scott Vetter

This IBM® Redpaper publication helps line of business (LOB), data science, and information technology (IT) teams develop an information architecture (IA) for their enterprise artificial intelligence (AI) environment. It describes the challenges that the three roles face when creating and deploying enterprise AI solutions, and how they can collaborate for best results. This publication also highlights the capabilities of the IBM Cognitive Systems and AI solutions: IBM Watson® Machine Learning Community Edition, IBM Watson Machine Learning Accelerator (WMLA), IBM PowerAI Vision, IBM Watson Machine Learning, IBM Watson Studio Local, IBM Video Analytics, H2O Driverless AI, IBM Spectrum® Scale, and IBM Spectrum Discover. This publication examines the challenges through five different use case examples: artificial vision; natural language processing (NLP); planning for the future; machine learning (ML); and AI teaming and collaboration. This publication targets readers from LOBs, data science teams, and IT departments, and anyone who is interested in understanding how to build an IA to support enterprise AI development and deployment.

IBM Cloud Object Storage System Product Guide

Author :
Publisher : IBM Redbooks
Total Pages : 214
Release :
ISBN-10 : 0738460133
ISBN-13 : 9780738460130
Rating : 4/5 (30 Downloads)

Synopsis IBM Cloud Object Storage System Product Guide by : Vasfi Gucer

Object storage is the primary storage solution that is used in cloud and on-premises solutions as a central storage platform for unstructured data. IBM Cloud Object Storage is a software-defined storage (SDS) platform that breaks down barriers for storing massive amounts of data by optimizing the placement of data on commodity x86 servers across the enterprise. This IBM Redbooks® publication describes the major features, use case scenarios, deployment options, configuration details, initial customization, performance, and scalability considerations of the IBM Cloud Object Storage on-premises offering. For more information about the IBM Cloud Object Storage architecture and the technology that is behind the product, see IBM Cloud Object Storage Concepts and Architecture, REDP-5537. The target audience for this publication is IBM Cloud Object Storage IT specialists and storage administrators.

IBM Software-Defined Storage Guide

Author :
Publisher : IBM Redbooks
Total Pages : 158
Release :
ISBN-10 : 0738457051
ISBN-13 : 9780738457055
Rating : 4/5 (55 Downloads)

Synopsis IBM Software-Defined Storage Guide by : Larry Coyne

Today, new business models in the marketplace coexist with traditional ones and their well-established IT architectures. They generate new business needs and new IT requirements that can only be satisfied by new service models and new technological approaches. These changes are reshaping traditional IT concepts. Cloud in its three main variants (Public, Hybrid, and Private) represents the major and most viable answer to those IT requirements, and software-defined infrastructure (SDI) is its major technological enabler. IBM® technology, with its rich and complete set of storage hardware and software products, supports SDI both in an open standard framework and in other vendors' environments. IBM services are able to deliver solutions to the customers with their extensive knowledge of the topic and the experiences gained in partnership with clients. This IBM Redpaper™ publication focuses on software-defined storage (SDS) and IBM Storage Systems product offerings for software-defined environments (SDEs). It also provides use case examples across various industries that cover different client needs, proposed solutions, and results. This paper can help you to understand current organizational capabilities and challenges, and to identify specific business objectives to be achieved by implementing an SDS solution in your enterprise.

IBM Watson Content Analytics: Discovering Actionable Insight from Your Content

Author :
Publisher : IBM Redbooks
Total Pages : 598
Release :
ISBN-10 : 0738439428
ISBN-13 : 9780738439426
Rating : 4/5 (26 Downloads)

Synopsis IBM Watson Content Analytics: Discovering Actionable Insight from Your Content by : Wei-Dong (Jackie) Zhu

IBM® Watson™ Content Analytics (Content Analytics) Version 3.0 (formerly known as IBM Content Analytics with Enterprise Search (ICAwES)) helps you to unlock the value of unstructured content to gain new actionable business insight, and provides enterprise search capability, all in one product. Content Analytics comes with a set of tools and a robust user interface to empower you to better identify new revenue opportunities, improve customer satisfaction, detect problems early, and improve products, services, and offerings. To help you gain the most benefit from your unstructured content, this IBM Redbooks® publication provides in-depth information about the features and capabilities of Content Analytics, how content analytics works, and how to perform effective and efficient content analytics on your content to discover actionable business insights. This book covers key concepts in content analytics, such as facets, frequency, deviation, correlation, trend, and sentiment analysis. It describes the content analytics miner, and guides you on performing content analytics using views, dictionary lookup, and customization. The book also covers using IBM Content Analytics Studio for domain-specific content analytics, integrating with IBM Content Classification to get categories and new metadata, interfacing with IBM Cognos® Business Intelligence (BI) to add value to BI reporting and analysis, and customizing the content analytics miner with APIs. In addition, the book describes how to use the enterprise search capability for the discovery and retrieval of documents using various query and visual navigation techniques, and customization of crawling, parsing, indexing, and runtime search to improve search results. The target audience of this book is decision makers, business users, and IT architects and specialists who want to understand and analyze their enterprise content to improve and enhance their business operations.
It is also intended as a technical how-to guide for use with the online IBM Knowledge Center for configuring and performing content analytics and enterprise search with Content Analytics.

IBM Data Engine for Hadoop and Spark

Author :
Publisher : IBM Redbooks
Total Pages : 126
Release :
ISBN-10 : 0738441937
ISBN-13 : 9780738441931
Rating : 4/5 (31 Downloads)

Synopsis IBM Data Engine for Hadoop and Spark by : Dino Quintero

This IBM® Redbooks® publication provides topics to help the technical community take advantage of the resilience, scalability, and performance of the IBM Power Systems™ platform to implement or integrate an IBM Data Engine for Hadoop and Spark solution for analytics solutions to access, manage, and analyze data sets to improve business outcomes. This book documents topics to demonstrate and take advantage of the analytics strengths of the IBM POWER8® platform, the IBM analytics software portfolio, and selected third-party tools to help solve customers' data analytics workload requirements. This book describes how to plan, prepare, install, integrate, manage, and use the IBM Data Engine for Hadoop and Spark solution to run analytic workloads on IBM POWER8. In addition, this publication delivers documentation to complement available IBM analytics solutions to help meet your data analytics needs. This publication strengthens the position of IBM analytics and big data solutions with a well-defined and documented deployment model within an IBM POWER8 virtualized environment so that customers have a planned foundation for security, scaling, capacity, resilience, and optimization for analytics workloads. This book is targeted at technical professionals (analytics consultants, technical support staff, IT Architects, and IT Specialists) who are responsible for delivering analytics solutions and support on IBM Power Systems.

Systems of Insight for Digital Transformation: Using IBM Operational Decision Manager Advanced and Predictive Analytics

Author :
Publisher : IBM Redbooks
Total Pages : 266
Release :
ISBN-10 : 073844118X
ISBN-13 : 9780738441184
Rating : 4/5 (84 Downloads)

Synopsis Systems of Insight for Digital Transformation: Using IBM Operational Decision Manager Advanced and Predictive Analytics by : Whei-Jen Chen

Systems of record (SORs) are engines that generate value for your business. Systems of engagement (SOEs) are always evolving and generating new customer-centric experiences and new opportunities to capitalize on the value in the systems of record. The highest value is gained when systems of record and systems of engagement are brought together to deliver insight. Systems of insight (SOIs) monitor and analyze what is going on with various behaviors in the systems of engagement and information being stored or transacted in the systems of record. SOIs seek new opportunities, risks, and operational behavior that need to be reported or acted on to optimize business outcomes. Systems of insight are at the core of the Digital Experience, which tries to derive insights from the enormous amount of data generated by automated processes and customer interactions. Systems of insight can also provide the ability to apply analytics and rules to real-time data as it flows within, throughout, and beyond the enterprise (applications, databases, mobile, social, Internet of Things) to gain the wanted insight. Deriving this insight is a key step toward being able to make the best decisions and take the most appropriate actions. Examples of such actions are to improve the number of satisfied clients, identify clients at risk of leaving and incentivize them to stay loyal, identify patterns of risk or fraudulent behavior and take action to minimize it as early as possible, and detect patterns of behavior in operational systems and transportation that lead to failures, delays, and maintenance and take early action to minimize risks and costs. IBM® Operational Decision Manager is a decision management platform that provides capabilities that support both event-driven insight patterns and business-rule-driven scenarios. It can also easily be used in combination with other IBM Analytics solutions, as the detailed examples show.
IBM Operational Decision Manager Advanced, along with complementary IBM software offerings that also provide capability for systems of insight, provides a way to deliver the greatest value to your customers and your business. IBM Operational Decision Manager Advanced brings together data from different sources to recognize meaningful trends and patterns. It empowers business users to define, manage, and automate repeatable operational decisions. As a result, organizations can create and shape customer-centric business moments. This IBM Redbooks® publication explains the key concepts of systems of insight and how to implement a system of insight solution with examples. It is intended for IT architects and professionals who are responsible for implementing a system of insight solution requiring event-based context pattern detection and deterministic decision services to enhance other analytics solution components with IBM Operational Decision Manager Advanced.