Cambridge Quantum (CQ) has released what it said is the world’s first toolkit and library for quantum natural language processing (QNLP), allowing sentences to be converted into a quantum circuit. The toolkit, called lambeq, is named after the late mathematician and linguist Joachim Lambek, and is designed to accelerate development of practical, real-world QNLP applications, such as automated dialog, text mining, language translation, text-to-speech, language generation and bioinformatics.
It has been released on a fully open-sourced basis for the benefit of the world’s quantum computing community and rapidly growing ecosystem of quantum computing researchers, developers and users. lambeq works seamlessly with CQ’s TKET, a quantum software development platform that is also fully open-sourced. This provides QNLP developers with access to the broadest possible range of quantum computers.
lambeq was conceived, designed and engineered by CQ’s Oxford-based quantum computing research team led by chief scientist Bob Coecke, with senior scientist Dimitrios Kartsaklis as chief architect of the platform. lambeq, and QNLP more broadly, is the result of a research project stretching back over a decade.
Coecke told embedded.com that artificial intelligence (AI) is not very transparent, which means it’s very hard to see why something happens. “The major problem is that the old mathematics of AI, the structural mathematics that people have been using, doesn’t fit with the current data-driven approaches to AI. So, what is needed is a completely new kind of mathematics, that is able to combine data-driven methods with reasoning methods.”
“The way words combine with meanings actually perfectly matches the way systems compose quantum mechanics. The typical quantum mechanical formula is also not very transparent, it’s very much like statistical theory. So, we have this new formula for quantum mechanics which is much more structure driven. It’s actually a pictorial, diagrammatic language. Natural language exactly matches this language, and the product we have launched, lambeq, is the first implementation of that idea.”
“Our team has been involved in foundational work that explores how quantum computers can be used to solve some of the most intractable problems in artificial intelligence. This work was based on advances originally pioneered by me, Steve Clark, now CQ’s Head of AI, and others. NLP sits at the heart of these investigations. The release of lambeq is the natural next step after the publication a few months ago that provided details of the world’s first QNLP implementation by CQ on actual quantum computers, and our initial disclosure of the foundational principles in December 2019.”
Coecke added, “In various papers published over the course of the past year, we have not only provided details on how quantum computers can enhance NLP but also demonstrated that QNLP is ‘quantum native,’ meaning the compositional structure governing language is mathematically the same as that governing quantum systems. This will ultimately move the world away from the current paradigm of AI that relies on brute force techniques that are opaque and approximate.”
lambeq enables and automates the design and deployment of NLP experiments of the compositional-distributional (DisCo) type that CQ scientists have previously described. This means moving from syntax/grammar diagrams, which encode a text’s structure, to either (classical) tensor networks or quantum circuits implemented with TKET, ready to be optimized for machine learning tasks such as text classification. lambeq has a modular design so that users can swap components in and out of the model and have flexibility in architecture design.
lambeq removes the barriers to entry for practitioners and researchers who are focused on AI and human-machine interactions, potentially one of the most significant applications of quantum technologies. TKET has gained a worldwide user base now measured in the hundreds of thousands. CQ said lambeq has the potential to become the most important toolkit for the quantum computing community seeking to engage with QNLP applications that are amongst the most important markets for AI. A key point that has become apparent recently is that QNLP will also be applicable to the analysis of symbol sequences that arise in genomics as well as in proteomics.
Merck Group, a launch partner and early adopter of lambeq, recently published a research paper on QNLP as part of a project with the innovation program Quantum Entrepreneurship Laboratory from the Technical University of Munich.
Thomas Ehmer from Merck’s IT healthcare innovation incubator and co-founder of the Quantum Computing Interest Group, said, “Using the unique features of quantum computing for fundamental breakthroughs is an important part of our research at Merck. Our recently disclosed project in QNLP with researchers from TU Munich has proven that binary classification tasks for sentences using QNLP techniques can achieve results comparable even at this stage to existing classical methods. Clearly, the infrastructure around quantum computing will need to advance before these techniques can be employed commercially. Critically, we can see how the approach employed in QNLP opens the route towards explainable AI, and thus to more accurate intelligence that is also accountable – which is critical in medicine.”
“There is a lot of interesting theoretical work on QNLP, but theory usually stands at some distance from practice,” said Kartsaklis. “With lambeq, we give researchers the opportunity to gain hands-on experience on experimental aspects of QNLP, which is currently completely unexplored ground. This is a crucial step towards reaching the point where practical, real-world NLP applications on quantum hardware become a reality.”
lambeq has been released as a conventional Python repository on GitHub. The quantum circuits generated by lambeq have thus far been executed on IBM quantum computers and Honeywell Quantum Solutions’ H series devices. The toolkit is introduced in an accompanying technical report.
How does QNLP work?
lambeq is an open-source software toolkit which turns sentences into quantum circuits, ready to be implemented on existing quantum hardware. The lambeq toolkit enables both professionals and enthusiasts to linguistically interact with quantum computers.
A brief overview of the toolkit’s pipeline and components follows.
The first step in the process is to parse a sentence. For selected compositional models a syntax tree is produced with the help of a statistical CCG (combinatory categorial grammar) parser. The next step is to convert the parse tree into a string diagram. A string diagram expresses amongst other things the grammatical structure of a sentence. For example, the sentence:
“We are explaining how lambeq works” in string diagram format becomes:
lambeq uses the Python library DisCoPy as a backend to store and manipulate these string diagrams.
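The grammatical structure that a string diagram captures can be illustrated without any library. The sketch below assumes pregroup-style word types (a noun type `n`, a sentence type `s`, and their left and right adjoints), a detail this article does not spell out, and it uses a simplified greedy reduction that is enough for these short examples but is not a general parser. All names in it are hypothetical and are not part of lambeq or DisCoPy.

```python
# Toy illustration only: not lambeq or DisCoPy code.
# A basic type is (base, adjoint): adjoint 0 is the plain type,
# +1 is a right adjoint (n^r), -1 is a left adjoint (n^l).
N, S = ("n", 0), ("s", 0)
N_R, N_L = ("n", 1), ("n", -1)

# Hypothetical lexicon assigning each word a list of basic types.
LEXICON = {
    "lambeq": [N],
    "Alice": [N],
    "Bob": [N],
    "works": [N_R, S],        # intransitive verb: expects a noun on its left
    "loves": [N_R, S, N_L],   # transitive verb: expects nouns on both sides
}


def reduce_types(words):
    """Greedily cancel adjacent pairs x^(k) x^(k+1), scanning left to right."""
    stack = []
    for word in words:
        for base, adjoint in LEXICON[word]:
            if stack and stack[-1] == (base, adjoint - 1):
                stack.pop()  # corresponds to a "cup" joining two wires
            else:
                stack.append((base, adjoint))
    return stack


def is_sentence(words):
    """A word sequence is grammatical here if only the type s is left over."""
    return reduce_types(words) == [S]
```

With this lexicon, `is_sentence(["lambeq", "works"])` and `is_sentence(["Alice", "loves", "Bob"])` both hold, while `is_sentence(["works", "lambeq"])` does not. Each cancellation corresponds to one wire connecting two words in the diagram.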
Next, the string diagram can be simplified or transformed by applying rewrite rules, for example to make it easier to convert into a circuit suited to currently available quantum hardware. The rewritten string diagram is then converted into an actual quantum circuit or a tensor network, depending on whether it is to be executed on a quantum or a classical computer, respectively. This conversion is conditioned on the user’s choice of ansatz; a pre-defined selection of ansätze is available within the toolkit.
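One way to see why the choice of ansatz and the rewrite step matter for hardware is to count qubits. The sketch below assumes that a circuit ansatz assigns a fixed number of qubits to each basic grammatical type and that a word occupies the qubits of every basic type in its grammatical type. It does only this bookkeeping; the names and the type notation are hypothetical and are not the lambeq API.

```python
# Toy bookkeeping only: hypothetical names, not the lambeq API.
QUBITS_PER_TYPE = {"n": 1, "s": 1}  # assumed ansatz choice: one qubit each

# Word types as lists of basic types. Adjoints carry a suffix (".r", ".l")
# and are assumed to use as many qubits as their base type.
WORD_TYPES = {
    "Alice": ["n"],
    "loves": ["n.r", "s", "n.l"],
    "Bob": ["n"],
}


def qubits_for_word(word, qubits_per_type=QUBITS_PER_TYPE):
    """Number of qubits the word's state is prepared on, before any wiring."""
    return sum(qubits_per_type[t.split(".")[0]] for t in WORD_TYPES[word])


def qubits_for_sentence(words, qubits_per_type=QUBITS_PER_TYPE):
    """Total qubits across all words in the sentence."""
    return sum(qubits_for_word(w, qubits_per_type) for w in words)
```

Under these assumptions, `qubits_for_sentence(["Alice", "loves", "Bob"])` is 5 with one qubit per type, and it rises to 9 if nouns are given two qubits. A rewrite rule that removes wires from the diagram lowers this count, which is one reason a simplified diagram is easier to fit on small devices.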
As an example, the quantum circuit for the sentence “we are explaining how lambeq works” is:
This output can then be passed through a quantum compiler such as CQ’s TKET, which is part of CQ’s open-source quantum computing SDK, and directed to a quantum simulator or to one of the growing number of available quantum computers. In the case of a tensor network for a classical experiment, the output can be passed to an ML library such as PyTorch or JAX.
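For the classical route, a rough picture of what a tensor network computes is that each word is a tensor and every wire joining two words is an index that gets summed over. The sketch below works this through for a two-word sentence in plain Python with made-up numbers and hypothetical names; a real experiment would use trained parameters and a library such as PyTorch or JAX.

```python
# Toy example with made-up numbers; not lambeq code.
noun = [0.6, 0.8]        # "lambeq": a vector in a 2-dimensional noun space
verb = [[0.9, 0.1],      # "works": rows index the noun space,
        [0.2, 0.7]]      # columns index a 2-dimensional sentence space


def contract(vector, matrix):
    """Sum over the shared noun index: out[j] = sum_i vector[i] * matrix[i][j]."""
    return [
        sum(v * row[j] for v, row in zip(vector, matrix))
        for j in range(len(matrix[0]))
    ]


# The wire joining the noun to the verb is the summed index; what remains
# is a vector in the sentence space.
sentence = contract(noun, verb)
```

The resulting `sentence` vector is what a downstream task such as text classification would consume, with the entries of `noun` and `verb` being the parameters optimized during training.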
The open-source Python code for lambeq is available in its GitHub repository. Some examples are available in the paper, “How to make qubits speak”.