-
Notifications
You must be signed in to change notification settings - Fork 208
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
State synthesis for quantum devices #2291
base: main
Are you sure you want to change the base?
Conversation
623ab4b
to
4517712
Compare
I, Ben Howe <[email protected]>, hereby add my Signed-off-by to this commit: 86681ef Signed-off-by: Ben Howe <[email protected]> Signed-off-by: Anna Gringauze <[email protected]>
Signed-off-by: Anna Gringauze <[email protected]>
Signed-off-by: Anna Gringauze <[email protected]>
Signed-off-by: Anna Gringauze <[email protected]>
4517712
to
7969a75
Compare
Signed-off-by: Anna Gringauze <[email protected]>
…antum-device-state
Signed-off-by: Anna Gringauze <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM 👍
Signed-off-by: Anna Gringauze <[email protected]>
Signed-off-by: Anna Gringauze <[email protected]>
Signed-off-by: Anna Gringauze <[email protected]>
…antum-device-state
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Still reviewing...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Still reviewing...
…antum-device-state
Signed-off-by: Anna Gringauze <[email protected]>
Signed-off-by: Anna Gringauze <[email protected]>
Signed-off-by: Anna Gringauze <[email protected]>
Signed-off-by: Anna Gringauze <[email protected]>
Signed-off-by: Anna Gringauze <[email protected]>
Signed-off-by: Anna Gringauze <[email protected]>
Signed-off-by: Anna Gringauze <[email protected]>
Signed-off-by: Anna Gringauze <[email protected]>
…ate-ops Signed-off-by: Anna Gringauze <[email protected]>
…antum-device-state Signed-off-by: Anna Gringauze <[email protected]>
Signed-off-by: Anna Gringauze <[email protected]>
Signed-off-by: Anna Gringauze <[email protected]>
Signed-off-by: Anna Gringauze <[email protected]>
Signed-off-by: Anna Gringauze <[email protected]>
…antum-device-state Signed-off-by: Anna Gringauze <[email protected]>
…antum-device-state Signed-off-by: Anna Gringauze <[email protected]>
I, Anna Gringauze <[email protected]>, hereby add my Signed-off-by to this commit: 9563371 Signed-off-by: Anna Gringauze <[email protected]>
…antum-device-state Signed-off-by: Anna Gringauze <[email protected]>
StrAttr:$numQubitsFuncName, | ||
StrAttr:$initFuncName |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
StrAttr:$numQubitsFuncName, | |
StrAttr:$initFuncName | |
FlatSymbolRefAttr:$numQubitsFuncName, | |
FlatSymbolRefAttr:$initFuncName |
If these are truly artifacts that shall be present in the IR, let's make them Symbol attrs.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If the signatures of those functions are changed by the synthesis, would this instruction get updated with symbols with new signatures during application of the substitution? I can try that and see if the synthesis works.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't believe that'll be an issue. That is, you can update the signature independently of the symbol.
let arguments = (ins | ||
StrAttr:$numQubitsFuncName, | ||
StrAttr:$initFuncName | ||
); | ||
let results = (outs cc_PointerType:$result); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We probably want to force this to be a ptr<state>
, no?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will do, thanks
createArgumentSynthesisPass(const std::vector<std::string> &funcNames, | ||
const std::vector<std::string> &substitutions); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do we need this? ArrayRef
should subsume rigid std::vector
here. I don't think we need this overload?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
std::vector<string>
does not get auto-converted to ArrayRef<StringRef>
... I can try ArrayRef<std::string>
instead
return calleeConverters; | ||
} | ||
|
||
std::pair<std::vector<std::string>, std::vector<std::string>> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Use SmallVector.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think i tried and something went wrong, will try again
llvm::raw_string_ostream ss(substBuff); | ||
ss << argCon.getSubstitutionModule(); | ||
mlir::SmallVector<mlir::StringRef> substs = {substBuff}; | ||
auto [kernels, substs] = argCon.collectAllSubstitutions(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks like you got in trouble here by moving the strings into another function, where they go out of scope and vanish.
Let's leave the string creation here, where it is. To change it might look a little nicer, but it is far less efficient. Instead of a single copy of the data, we're building vectors of copies of the data and passing those around. The LLVM Way (tm) is preferred in the compiler code.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The nested argument converters (for kernels used in get_state calls) have their new kernel names, created during state argument conversion, so I moved the kernel name storage to the ArgumentConverter for the kernel.
I thought about an alternative of having a special "storage" for the new names collection and passing that storage by reference to ArgumentConverter and its children, but that does not seem to solve the problem of efficiency of copying the name for the entry kernel. I appreciate any ideas here!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let me try to be more clear.
The LLVM way to do this sort of thing is to build the string and/or vector data on the stack in the calling function. This will be reclaimed once the task we're performing completes.
For the called functions (the call tree below the "top" function where the data was built), we use the LLVM ADTs which are efficient lightweight wrappers around that data. This eliminates the use of heap memory, accidentally calling copy-constructors and destructors, etc. At the "leaf layer" where we create, say, MLIR IR, any data that must be interned is captured into the MLIRContext.
Why is it done this way? Because both LLVM and MLIR use a model of invariant pointers to refer to distinct artifacts in the overall representation. If we have an Op*
, then that pointer value will never change. The Op
may be logically replaced, by updating all references in the IR, but the Op
itself will never be moved/copied someplace else in memory. This applies to Type values, string data, etc. Once it's in the IR, think of it as semi-permanent or congealed into a memory location, if you will. Because of all that, using STL containers and std::string
is likely because the code isn't really structured in the spirit of the underlying LLVM model.
Of course this model imposes a bit of structure on how we write the code and implement things. It limits the types of refactoring we can do. But that's ok, because we can do it the LLVM Way, be efficient, and avoid bugs.
It takes two kernel names as ASCIIZ string literals: | ||
- "num_qubits" for determining the size of the allocation to initialize | ||
- "init" for initializing the state the same way as the original kernel | ||
passed to `cudaq::get_state`) as ASCIIZ string literal |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
passed to `cudaq::get_state`) as ASCIIZ string literal | |
passed to `cudaq::get_state`) as an ASCIIZ string literal |
- "init" for initializing the state the same way as the original kernel | ||
passed to `cudaq::get_state`) as ASCIIZ string literal | ||
|
||
And returns the quantum state of the original kernel passed to |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And returns the quantum state of the original kernel passed to | |
This operation will return the quantum state of the original kernel passed to |
passed to `cudaq::get_state`) as ASCIIZ string literal | ||
|
||
And returns the quantum state of the original kernel passed to | ||
`cudaq::get_state`. The operation is replaced by calls to the kernels with |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
`cudaq::get_state`. The operation is replaced by calls to the kernels with | |
`cudaq::get_state`. The operation may be replaced by calls to a kernel, which will reproduce the specified state and whose name is provided |
|
||
And returns the quantum state of the original kernel passed to | ||
`cudaq::get_state`. The operation is replaced by calls to the kernels with | ||
the provided names in `ReplaceStateByKernel` pass. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the provided names in `ReplaceStateByKernel` pass. | |
by the `ReplaceStateByKernel` pass. |
|
||
```mlir | ||
%0 = quake.get_state "callee" : !cc.ptr<!cc.state> | ||
%0 = quake.get_state "num_qubits" "init" : !cc.ptr<!cc.state> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think these two parameters need to be an IntegerAttr and a Symbol. The syntax ought to look like
quake.get_state @init_artifact, <n> (attributes { ... })?
We don't even need the type, since I don't think it can be anything other that !cc.ptr<!state>
, right?
"Replace `quake.init_state` instructions with call to the kernel generating the state"; | ||
let description = [{ | ||
Argument synthesis for state pointers for quantum devices substitutes state | ||
argument by a new state created from `__nvqpp_cudaq_state_get` intrinsic, which |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
argument by a new state created from `__nvqpp_cudaq_state_get` intrinsic, which | |
a new state created from the `__nvqpp_cudaq_state_get` intrinsic for the state argument. |
let description = [{ | ||
Argument synthesis for state pointers for quantum devices substitutes state | ||
argument by a new state created from `__nvqpp_cudaq_state_get` intrinsic, which | ||
in turn accepts the name for the synthesized kernel that generated the state. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
in turn accepts the name for the synthesized kernel that generated the state. | |
The `__nvqpp_cudaq_state_get` intrinsic accepts the symbol for the synthesized kernel that generated the state. |
|
||
This optimization completes the replacement of `quake.init_state` instruction by: | ||
|
||
- Replace `quake.init_state` by a call that `get_state` call refers to. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- Replace `quake.init_state` by a call that `get_state` call refers to. | |
- Replacing `quake.init_state` by a call to a kernel to construct the state. e.g., The `cudaq::get_state` call refers to the result of a specific quantum kernel being invoked with a set of parameters. |
This optimization completes the replacement of `quake.init_state` instruction by: | ||
|
||
- Replace `quake.init_state` by a call that `get_state` call refers to. | ||
- Remove all unneeded instructions. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- Remove all unneeded instructions. | |
- Remove any unneeded Ops. |
I think we only really need to remove Op
s that have side-effects. Any that are pure and unused will just go away with the next DCE that runs.
@@ -163,3 +163,12 @@ cudaq::opt::createArgumentSynthesisPass(ArrayRef<StringRef> funcNames, | |||
return std::make_unique<ArgumentSynthesisPass>( | |||
ArgumentSynthesisOptions{pairs}); | |||
} | |||
|
|||
std::unique_ptr<mlir::Pass> cudaq::opt::createArgumentSynthesisPass( | |||
const std::vector<std::string> &funcNames, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As discussed, we should see if we can't get rid of this overload and use the original function with the LLVM ADT signature.
if (!cudaq::get_quake_by_name(cudaq::getKernelName(kernel), false).empty()) { | ||
return state(new QuantumState(std::forward<QuantumKernel>(kernel), | ||
std::forward<Args>(args)...)); | ||
} else { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@@ -21,7 +21,7 @@ class FakeSimulationState : public cudaq::SimulationState { | |||
virtual std::unique_ptr<SimulationState> | |||
createFromSizeAndPtr(std::size_t size, void *data, | |||
std::size_t dataType) override { | |||
std::runtime_error("Not implemented"); | |||
throw std::runtime_error("Not implemented"); | |||
return std::make_unique<FakeSimulationState>(size, data); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The return is never reached, right?
"Substitution module:\n" | ||
<< ab.getSubstitutionModule() << '\n'; | ||
// Dump the modified source module | ||
llvm::outs() << "Source module (after):\n" << *mod << '\n'; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this dumping the same thing? It looks like it was dumping the substitution module but now is dumping the module receiving the substitutions?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since this is testing argument conversions, we must care about building the substitutions properly. The code into which they are applied is only interesting at the point of argument synthesis, no?
// The state is represented by a quantum kernel. | ||
// Quantum state contains all the information we need to replicate a | ||
// call to kernel that created the state. | ||
class QuantumState : public cudaq::SimulationState { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think something like this should be in the qis
folder. Would be best to move it to somewhere in the platform folder
, I'd also suggest to name it something like RemoteQPUSimulationState
.
As-is, QuantumState
is very generic and putting it in the qis
folder (in the public cudaq
namespace) makes one think it is more of a language-level thing, when really its an internal implementation detail.
auto argPos = initFunc.getArguments().size(); | ||
|
||
// Detect errors in kernel passed to get_state. | ||
std::function<void(Block &)> processInner = [&](Block &block) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
std::function<void(Block &)> processInner = [&](Block &block) { | |
auto processInner = [&](Block &block) { |
for (auto ®ion : op.getRegions()) { | ||
for (auto &b : region) | ||
processInner(b); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
for (auto ®ion : op.getRegions()) { | |
for (auto &b : region) | |
processInner(b); | |
} | |
for (auto ®ion : op.getRegions()) | |
for (auto &b : region) | |
processInner(b); | |
processInner(b); | ||
|
||
// Process outer block to initialize the allocation passed as an argument. | ||
std::function<void(Block &)> process = [&](Block &block) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
std::function<void(Block &)> process = [&](Block &block) { | |
auto process = [&](Block &block) { |
@@ -99,12 +99,245 @@ static Value genConstant(OpBuilder &, cudaq::cc::StructType, void *, | |||
static Value genConstant(OpBuilder &, cudaq::cc::ArrayType, void *, | |||
ModuleOp substMod, llvm::DataLayout &); | |||
|
|||
/// Create callee.init_N that initializes the state | |||
/// Callee: | |||
/// func.func @__nvqpp__mlirgen__callee(%arg0: i64) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is callee
supposed to be the kernel that is captured by get_state
?
|
||
auto argTypes = calleeFunc.getArgumentTypes(); | ||
auto retTy = quake::VeqType::getUnsized(ctx); | ||
auto funcTy = FunctionType::get(ctx, argTypes, TypeRange{retTy}); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why are we returning qubit references? From the above comments, it looks like the references are passed into this new function as an argument.
Returning a set of new references violates the assumption that the entry-point kernel is strictly responsible for creating all the (non-temporary) qubits for the circuit. (In other words, other passes will be broken, if this rule is not followed.)
throw std::runtime_error( | ||
"cudaq::state* argument synthesis is not supported for quantum hardware" | ||
"for c-like functions, use class kernels instead"); | ||
} | ||
#endif |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
#endif | |
#else |
and wrap the return in the else conditional block...
#include "cudaq/qis/state.h" | ||
#include <cassert> | ||
#include <memory> | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit
{ | ||
auto kernel = "init"; | ||
auto kernelCode = | ||
"" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"" |
{ | ||
auto kernel = "init"; | ||
auto kernelCode = | ||
"" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"" |
builder.getStringAttr(initKernelName)); | ||
} | ||
|
||
TODO("cudaq::state* argument synthesis for quantum hardware for c functions"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why doesn't this use cudaq::qkernel
? That's the vehicle for capturing quantum kernels in general-purpose C++ code contexts. Since cudaq::get_state()
may appear anywhere, I don't understand how we know what kernel is being captured.
There is the additional issue of captured arguments as well...
I'm going to have to experiment with this on my machine. I'm not sure I get what's going on... For example, int main() {
// This is not quantum code and will never be seen by our compiler.
...
KernelA a; // our compiler sees KernelA's call operator.
a(arg0, arg1, ... argN); // but not this call
...
auto state1 = cudaq::get_state(a); // is this ok per the spec?
...
// f is a plain old function marked with __qpu__, so compiler sees it
...
auto state2 = cudaq::get_state(f, frg0, frg1, ... frgN); // I think this must be supported
....
KernelB b; // our compiler sees KernelB's call operator
b(brg0, brg1, ..., state1, state2); // but not this call
...
} So our implementation must capture the context to fully recreate the kernel launches of If we focus on Now "argument synthesis" is sort of a misnomer at this point. So unless I am missing something, it seems like this can be implemented without argument synthesis and as a standalone compiler pass, no? (Argument conversion would need to know that the In a bit of a cosmic twist, it is actually |
return | ||
} | ||
|
||
func.func @init(%arg0: i64, %arg1: !quake.veq<?>) -> !quake.veq<?> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For these init
kernels, the qubit vector is passed as references, which means the qubits passed in will be changed by the function in-place (or remain unchanged if unreferenced, I suppose). I don't think returning that same set of qubit references is needed.
return %arg1 : !quake.veq<?> | ||
} | ||
|
||
func.func @num_qubits(%arg0: i64) -> i64 { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When we get to the phase of building the substitutions, we should know the number of qubits as a constant, right?
@@ -21,7 +21,7 @@ class FakeSimulationState : public cudaq::SimulationState { | |||
virtual std::unique_ptr<SimulationState> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This header file and FakeQuantumState.h
are really just scaffolding for the test and aren't included anywhere else. We could get rid of these .h files and move the code into the file test_argument_conversion.cpp
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In fact, most of the code for these two classes is the same or very similar and could be shared or templated.
Synthesize state pointer for quantum devices:
SimulationState:
hasData
API that returns true iff the vector data exists on the state or can be computed.getKernelInfo
API that returns optional kernel name and a list of arguments for the kernel that generated the state.QuantumState:
ArgumentConversion:
quake.get_state "callee.num_qubits_N" "callee.init_N"
instruction.callee.num_qubits_N
andcallee.init_N
functions that are created from the callee code and compute allocation size and initialize the allocation, respectively.ArgumentConverter
.Passes:
ReplaceStateWithKernel
pass thatquake.get_num_qubits
instructions by a call tocallee.num_qubits_N()
quake.init_state
instructions by a call tocallee.init_N()
Synthesis:
Tests:
test_argument_conversion
ReplaceStateWithKernel
passArgumentSynthesis
pass with state pointer substitutionstargettests/execution/qvector_init_from_state.cpp
test that runs for various quantum backends emulationNotes
Currently state pointer synthesis is only supported for kernels implemented as
operator()
inside a struct.TODO (in subsequent PRs)
cudaq::qkernel
to support c-like kernels in state pointer synthesisRequires: #2354