Knowledge as Infrastructure – and a National Research Resource
We are amid three revolutions in AI: deep learning, knowledge graphs, and automated reasoning, as articulated by Prof. Kenneth Forbus during the AI Session at the National Science Board Meeting on May 1, 2024. Technology in each of these areas has progressed to the point where it is used daily and delivers significant value. While deep learning, which has unleashed powerful tools like ChatGPT, is a large focus of today’s AI, knowledge graphs and reasoning complement it. Knowledge graphs already power key web and e-commerce services provided by large corporations. Automated reasoning techniques are in wide use in many areas; the auto industry, for example, uses them to find flaws in vehicle electrical systems and to verify behavior against design specifications. In this talk, we will discuss techniques, technologies, and strategies for building out the knowledge infrastructure that will be essential to support the next generation of AI. The National Science Foundation’s Open Knowledge Network (OKN), envisioned as part of NSF’s Harnessing the Data Revolution Big Idea, provides an important start. The Prototype-OKN (https://www.proto-okn.net/) is a coordinated effort among 18 projects funded under the NSF Proto-OKN program, in collaboration with NASA, NIH, NIJ, NOAA, USGS, and other agencies, to create an open knowledge network structure linking disparate, heterogeneous information from diverse sources. Of these, 15 projects are “Theme 1” efforts focused on providing solutions to specific problems, working in conjunction with federal, state, or other government agencies. Two projects, FRINK and SPIDER, are working towards a common Proto-OKN “fabric”, or technical infrastructure, on which to deploy Theme 1 solutions, and one project, EduGate, is focusing on education, training, and outreach for this effort. Knowledge representation and reasoning are not new topics in computer science.
However, to achieve translational impact in real-world applications, we need to shift towards treating the development and operation of computer science-based knowledge structures as infrastructure, in addition to building bespoke solutions to specific problems. This would begin with hosting repositories of knowledge in computational form, based on open government, scientific, and other open data. The term “translational” implies a focus on use-inspired, solution-oriented R&D efforts with the potential to directly impact people's everyday lives. An important focus is education and training, including the teaching of knowledge representation at every level, from high school through undergraduate computer science and beyond. Attention to issues ranging from governance to ethics is also key to building trust in the system.