As you probably all know, I worked on ML and NLP for one of the Microsoft knowledge graphs, which has 2 billion entities. My original work there was on data mining and extraction with natural language processing. The knowledge graph we built holds about 10 TB of data and has APIs available for integration with Skype, Bing.com, Bing Ads, Dynamics CRM, Windows Cortana, Word, MSN, and Excel.
Knowledge graphs are large semantic networks that integrate heterogeneous information sources to represent information and support better understanding. Developers working with data science on graphs will be able to put their knowledge to work with this practical guide to engineering and machine learning. The book illustrates the efficient use of approaches and methodologies that will get you up and running in production and help you make informed decisions by identifying the strengths and weaknesses of different tools. Here is more information on how we used it in Excel:
http://agafonovslava.com/page/graph-knowledge-in-microsoft-excel
The great advantage here is the flexibility provided by the graph representation of the information, which enables the same data model to serve many use cases and scenarios with small adaptations. Furthermore, all the scenarios can coexist in the same database. This frees data scientists and data engineers from having to deal with multiple representations of the same information. My new book will be about designing and scaling graph applications, because I feel there is a lot of material on how to use them but not much on how to build and scale them:
- Getting started with Graph Applications
- Harnessing Knowledge Graphs (KG), Web semantics, and Information Extraction
- Graph embeddings, and Named Entity Recognition
- Working with large-scale parallelization tools on graph data
- Batch processing with .NET for Apache Spark and .NET Kafka
- Graph data engineering, modeling, and processing
- Supervised and unsupervised machine learning for graphs
- Deep Learning as a Graph Theory
- Tradeoffs, constraints, and benefits of immense scale graph machine learning
- Field Programmable Gate Arrays (FPGAs) and GPUs
- Optimization techniques for graph processing
- Cost vs. Maintainability in production
- Overview of Various Graph Frameworks
- Overview of Storage Platforms
- Data Version Control (DVC)
This book has several goals. Complete with explanations of crucial concepts, practical examples, and self-assessments, it will help you start applying ML techniques to turn unstructured datasets into a semantically linked graph at scale. You will understand the knowledge graph, build your own, convert it to graph embeddings, and apply machine learning algorithms to extract new information and relationships that were almost impossible to obtain with standard algorithms before. By the end of this book, you will have learned how graph analytics reveals more predictive elements in today's data, and how to create ML workflows for link prediction and clustering.
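To give a flavor of what link prediction means in practice, here is a minimal sketch of a classic baseline: scoring every pair of unconnected nodes by the Jaccard similarity of their neighbor sets. This is not the workflow from the book or from any Microsoft system; the toy edge list and the function names `adjacency` and `jaccard_scores` are assumptions made up for illustration, and a real workflow would typically use graph embeddings and a trained model instead.

```python
from itertools import combinations

# Hypothetical toy graph (undirected edges); not real data.
EDGES = [("a", "b"), ("a", "c"), ("b", "c"), ("b", "d"), ("c", "d"), ("d", "e")]


def adjacency(edges):
    """Build a node -> set-of-neighbors map from undirected edges."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def jaccard_scores(adj):
    """Score each unconnected node pair by |common neighbors| / |all neighbors|."""
    scores = {}
    for u, v in combinations(sorted(adj), 2):
        if v in adj[u]:
            continue  # already linked, nothing to predict
        union = adj[u] | adj[v]
        if union:
            scores[(u, v)] = len(adj[u] & adj[v]) / len(union)
    return scores


adj = adjacency(EDGES)
scores = jaccard_scores(adj)

# Rank candidate links from most to least likely under this heuristic.
ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
for pair, score in ranked:
    print(pair, round(score, 3))
```

In this toy graph the pair `a` and `d` ranks first because the two nodes share two of their three combined neighbors. A heuristic like this is mainly useful as a baseline to compare a learned model against.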
Data on its own is useless: it doesn’t provide any value. To make sense of the data, we have to interact with it and organize it. This process produces information. Turning this information into knowledge, which reveals relationships between information items (a change in quality), requires further effort. This transformation process “connects the dots,” causing previously unrelated information to acquire sense, significance, and logical semantics. From knowledge come insight and wisdom, which are not only relevant but also provide guidance and can be converted into actions: producing better products, making users happier, reducing production costs, delivering better services, and more. This is where the true value of data resides, at the end of a long transformation path, and machine learning provides the necessary “intelligence” for distilling value from it.
This book will also cover how to develop intelligent solutions that enhance your applications with machine learning and discover new value in the graph analytics domain. The example applications in this book will focus on NLP. The book has several parts. We will start with an overview of graph applications: what they are and why they are gaining momentum. We will discuss the infrastructure knowledge graphs need to store triples, representing entities as nodes, attributes as node labels, and relationships between objects as edges. We will cover the basics of data processing with NLP techniques, machine learning, and graph theory, and explore the core tools and techniques required to build a wide range of powerful NLP apps that help computers better understand humans. Furthermore, we will cover building systems that use RDF data with both relational and non-relational databases. Readers will also get a formal introduction to ontology and to the difference between storing and relating information in relational and graph databases.
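As a rough illustration of the triple representation mentioned above, the sketch below splits a list of (subject, predicate, object) triples into node labels and labeled edges. It is a simplified assumption of how such a mapping could look, not the storage model of any particular graph database or RDF store; the sample triples, the `type` predicate, and the helper names `build_graph` and `neighbors` are hypothetical.

```python
# Hypothetical sample triples (subject, predicate, object); not from a real dataset.
TRIPLES = [
    ("Excel", "type", "Product"),
    ("Word", "type", "Product"),
    ("Microsoft", "type", "Company"),
    ("Microsoft", "develops", "Excel"),
    ("Microsoft", "develops", "Word"),
]


def build_graph(triples, label_predicate="type"):
    """Turn triples into node labels plus (source, relationship, target) edges.

    Assumption: triples whose predicate equals `label_predicate` describe a
    node attribute (its label); every other triple is an edge between nodes.
    """
    labels = {}  # node -> set of labels
    edges = []   # list of (source, relationship, target)
    for subject, predicate, obj in triples:
        if predicate == label_predicate:
            labels.setdefault(subject, set()).add(obj)
        else:
            edges.append((subject, predicate, obj))
            # Register both endpoints as nodes even if they carry no label.
            labels.setdefault(subject, set())
            labels.setdefault(obj, set())
    return labels, edges


def neighbors(edges, node, relationship):
    """Return the targets reachable from `node` over the given relationship."""
    return [target for source, rel, target in edges
            if source == node and rel == relationship]


labels, edges = build_graph(TRIPLES)
print(labels["Excel"])
print(neighbors(edges, "Microsoft", "develops"))
```

Keeping labels and edges in separate structures mirrors the node/attribute/edge split described above, while the original triples remain the single source of truth.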
Please let me know what you would like me to cover in such a book.