How to improve the RISC-V specification

(alastairreid.github.io)

162 points | by todsacerdoti 13 days ago

16 comments

acuster 13 days ago
You are quite right that the document that 'specifies' RISC-V remains a key weakness in the whole movement.
For expediency, the choice was made to not sweat it. So the document is actually called a 'Manual' but is linked as being the specification. Even so, the document needs a real editor to review it. For example, the preferred bit pattern which is to be processed by an implementation as doing nothing but incrementing the program counter ('no op') is called an 'instruction' in some sections but is clearly not in others---a dumb discrepancy. A review by a good technical editor would be a great first step in improving the document.
However, the greater tragedy is that a great 'specification' for RISC-V would be an invaluable educational document. This would be a very hard document to write. No document that I could find has ever tried to specify an instruction set independent of an actual implementation. So there is no roadmap towards writing a good spec for RISC-V. This is surely one of the reasons the effort has not yet been started.
After a couple of months trying to imagine how such an effort could be undertaken, how one could argue that the effort was worth trying, and how I might convince the community of the value and need for a good spec, I gave up. The work would require a team combining very fine technical knowledge with exceedingly accurate control of technical English. The work would be a multi-person-year effort, requiring concomitant funding. It is not clear to me how this work might begin.
Also you are entirely right to think about the test suite as a central concern. Specifications are strange documents. Some specifications make requirements which can not be tested; this affects the very nature of what is being 'specified'. Others have tried to root every injunction in the test suite; that approach leads to its own difficulties. The specification will have to make its choice on the matter and the authors would benefit from being very clear with themselves about what stance they are taking on the matter.
So thanks for your argument for a better specification; it would be a wonderful addition to the open instruction set. Hopefully, somehow, such an effort finds its wings.
- gchadwick 13 days ago
  > No document that I could find has ever tried to specify an instruction set independent of an actual implementation.
  What do you mean by this? I'd say most ISA specifications do this (e.g. the Arm spec doesn't refer to Arm's CPU implementations and has well defined ways to discuss things that can be implementation dependent).
  - acuster 12 days ago
    Sorry, it's three in the (Sunday) morning, and I've been hitting the whisky trying to handle the establishment journalists having fun, while other journalists are talking about humans struggling to get water while themselves being asked if they will survive the night. ---I'm not at my best.
    You're right to call me on my statement; I should have all my notes on hand to make that claim and I don't. Paah, no, I do: ARMv7-M Architecture Reference Manual ... Part A Application Level Architecture ... ...processor in Thread mode (vs. in Handler mode).
    So ARM already has a lot of detail whereas the RISC-V architecture is trying to (has to?) start even more abstract, where code doesn't even have modes (no interrupts).
    This all started a pandemic Saturday morning, cup of coffee in hand, enthusiasm to read the "RISC-V Spec" and see what I could learn. Download. Confusion: it says "manual," did I get the right thing? ... Ok, yeah, that's what's on offer. Half an hour later, I'm actually pissed off, like actively angry. I'm reading this from the point of view of "what's the execution environment that I'll be working against?" and I'm getting hit with "unprivileged" which is just wrong. It turns out they are mixing up "the environment of general purpose programmers" with "the minimal that needs to be implemented"---it's a royal mess, they kinda give up on it in the middle. I'm angry about being asked to read this as "the product"; it's not even properly proof-edited. So I took my frustration and tried to figure out 'what would you do to make this better?'
    The 'RISC-V' spec is trying to specify: [instructions], and what they do to the [architecture]. I don't know much about the details, but I have a notion that there was push back on writing this up as a 'state machine' and how each instruction might change that state. I assume Prof. Asanović had his own good reason to avoid framing things that way but he's yet to give us a good explanation of why. So probably he's right, I just don't know why.
    So how could this be done?
    I went to look at the history. The original x86 spec was tied to the chip they were trying to sell. PowerPC, MIPS, if I remember right, were not 'specified' in a clean way--none of them had the same challenge as RISC-V does, starting in pure execution environment mode. I went to read the infamous von Neumann writeup and got side-tracked by his virtual neurons but didn't find the right level of abstraction there either.
    So, I'm sorry I can't really justify myself here, but this is all subtle and hard. From what I have found, I don't think anyone has faced the challenge that RISC-V faces, so I don't think we have a roadmap for the spec that RISC-V ought to have.
    cheers
    - thechao 12 days ago
      I really think something like Ghidra/SLEIGH and a formal specification of the p-code could help; but, only if the following things happened:
      1. A p-code parser front end in C existed;
      2. An alternate XML/JSON version of SLEIGH; and,
      3. A way to integrate the above to document (book) generation.
      For the latter I'd prefer HTML. I've found the SLEIGH spec, itself, heavy enough going that I can't tell if it supports full constraint specifications, or not.
      - IAmLiterallyAB 12 days ago
        > full constraint specifications
        Is that a technical term? (if so, can you explain further)
        I've made SLEIGH specs for two architectures. In my experience, it can describe 95% of the semantics well enough for decompilation (it gets weird when your ISA has quirks). Not as comprehensive as SAIL appears to be
        Also, SLEIGH compiles to an XML format which is what Ghidra actually uses
        thechao 12 days ago
        CPUs are fairly orthogonal in terms of capabilities; if the instruction can encode it, the CPU can interpret it. Coprocessors (GPUs, NPUs, etc.) have ISAs where the legal encoding space is much larger than the legal instruction space: the set of valid instructions is not dense in its own encoding space. This smaller legal space is defined by a set of constraints on the set of legal encodings.
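        To make the idea concrete, here is a minimal sketch of a "constraint specification" over an encoding space. The 8-bit instruction layout and the three constraints are entirely made up for illustration; they do not describe any real coprocessor or anything SLEIGH provides.

```python
from typing import Callable, Dict, List

# Hypothetical 8-bit coprocessor encoding, invented for this sketch:
#   bits 7..6 = opcode, bits 5..3 = dst register, bits 2..0 = src register.
def fields(word: int) -> Dict[str, int]:
    """Split an 8-bit word into its (made-up) fields."""
    return {"op": (word >> 6) & 0x3, "dst": (word >> 3) & 0x7, "src": word & 0x7}

# Each constraint is a predicate over the decoded fields.
CONSTRAINTS: List[Callable[[Dict[str, int]], bool]] = [
    lambda f: f["op"] != 3,                          # opcode 3 is reserved
    lambda f: f["op"] != 2 or f["dst"] % 2 == 0,     # op 2 needs an even dst
    lambda f: f["op"] != 1 or f["dst"] != f["src"],  # op 1 forbids dst == src
]

def is_legal(word: int) -> bool:
    """An encoding is legal only if every constraint holds."""
    f = fields(word)
    return all(check(f) for check in CONSTRAINTS)

# Enumerate the whole encoding space and keep the legal subset.
legal = [w for w in range(256) if is_legal(w)]
print(f"{len(legal)} of 256 encodings are legal")
```

        Every 8-bit word decodes into fields, but only a subset passes all the predicates, which is the "not dense in its own encoding space" situation described above.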
- acuster 12 days ago
  (A better, more considered, response than I gave yesterday.)
  Alastair Reid's article "How to improve the RISC-V specification" makes some great points: namely that the RISC-V specification needs improvement, (implicitly) that this work is worth doing, and that testing is integral to specification. These are all great points.
  However, a new specification effort for RISC-V requires a much greater effort.
  The core document certainly needs to be rewritten with better structure, a more formal presentation of each 'instruction,' better clarity on what it is saying, and fixes to numerous errata. The harder 'fix' requires resolving the tension between its two readerships --- the implementors who must process instructions according to the requirements of the spec versus the coders who need to know what they can expect from the environment in which their instructions are processed. (The 'unprivileged' element in the subtitle of the spec is the code being executed.) Problematically, this tension might not be resolvable: since the core instruction set has no side effects and no requirements can be placed on how implementations are made, the spec inherently has no way to express knowledge of whether the code has been processed correctly! Fun.
  A new specification effort has a bigger task than fixing, as best as possible, the current document. Alastair Reid's article mentions needing to integrate testing more centrally into the specification. There is also the need to think of this core spec as the foundational document on which to build the whole suite of RISC-V specification documents. A new spec also needs to cater to the even wider readership, beyond 'implementors' and 'coders,' that accompanies projects with wide-spread success. For example, the spec ought to serve companies or governments in their procurement contracts, so they can express what is meant by deliverables which are conformant with the specification. Ideally, this wider readership under consideration would also include 'students,' that is smart people who have less a priori knowledge of the domain than that which the original 'manual' was able to assume.
  This is all a lot of work, which would be of great benefit to the community but which no one in the community has any reason to, or really could, take on. Ideally, the RISC-V Foundation would scope out the work, take a position on what parts of the effort were worth pushing for, and then make that happen.
  - adreid 10 days ago
    Fixing the whole problem is large but there is a small sequence of steps that would improve things a lot that can be taken.
    Specify the key formats in a json/xml file then go one by one through tools, docs, etc changing them to use the machine readable file.
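    A rough sketch of what such a first step could look like: one machine-readable description of an instruction format, and a tool that reads it instead of hard-coding bit positions. The JSON schema here is invented for illustration (it is not the format used by any existing RISC-V repository); the bit positions are those of the RV32I R-type layout.

```python
import json

# Illustrative description of the R-type format as {field: [high_bit, low_bit]}.
# The schema is an assumption made for this sketch.
FORMATS_JSON = """
{
  "R": {
    "opcode": [6, 0],
    "rd":     [11, 7],
    "funct3": [14, 12],
    "rs1":    [19, 15],
    "rs2":    [24, 20],
    "funct7": [31, 25]
  }
}
"""

FORMATS = json.loads(FORMATS_JSON)

def extract(word: int, fmt: str) -> dict:
    """Pull every field of the named format out of a 32-bit instruction word."""
    out = {}
    for name, (hi, lo) in FORMATS[fmt].items():
        width = hi - lo + 1
        out[name] = (word >> lo) & ((1 << width) - 1)
    return out

# 0x002081B3 is the encoding of `add x3, x1, x2`.
print(extract(0x002081B3, "R"))
```

    A disassembler, a doc generator and a test generator could all consume the same file, so a fix to the description propagates to each of them.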
- gumby 12 days ago
  > No document that I could find has ever tried to specify an instruction set independent of an actual implementation.
  The most famous example is MIX.
  - weebull 12 days ago
    ...because those are normally proprietary documents.
    ARM, for example, put a tonne of effort into decoupling their ISA specification from their implementations, and guard this document.
    You're seeing how the sausage is made.
- pstoll 12 days ago
  > The work would be a multi-person-year effort, requiring concomitant funding.
  > It is not clear to me how this work might begin.
  To me that screams for - develop interest in a US public sector (aka government) funding agency and start getting grants to do this type of work.
  I assume RISC-V can reasonably be pitched to government tech & policy folks as having many directly applicable benefits to their set of problems in a way they may want to invest in. To procure RISC-V systems, they’d probably want a bunch of verification done. If the community isn’t investing, maybe they would.
  Caveat I’m not current on the state of RISC-V adoption in public sector systems. But I’ve done funded systems development work at start ups with DARPA/ARPA, InQTel, etc. A motivated party should be able to make this happen.
  Does anyone closer to RISC-V know what the state of adoption is in the public sector?
  - gumby 12 days ago
    > To me that screams for - develop interest in a US public sector
    I disagree 100%.
    (note I'm a big fan of the RISC V effort)
    First, the US government can/should seed pieces of fundamental technology that the private sector may later use, but in general you really don't want the government funding direct competitors to competitive technology already in the private sector (boosting RISC V when ARM, x86 etc are widely available). On a practical level, even if you do think the USG should do such a thing it won't happen.
    Look at the members of RISC V International -- they are mostly huge multinationals who could both afford and benefit from such work. In fact the author of the post works for Intel. They are the people who should be paying for it (the smaller members, like SiFive, are mostly bleeding money).
    There is one hack that could work, and might even be the most feasible: get DARPA to fund the pedagogical document acuster described. That to me is similar to previous successful research-supporting projects DARPA has funded in the past, such as SPICE. But that's a special case.
    And anyway, why USG? Where are China and Europe in this? The EU funding this kind of pedagogical document would be a good step towards getting its chip mojo back.
- Pet_Ant 12 days ago
  This sounds like the perfect case for crowdfunding. The community would likely support it. I bet some industry partners would support it. If you had a reasonably capable team, throw it up there. $1M means 3 people at $330k; that seems feasible.
gchadwick 13 days ago
Another issue I take with the RISC-V spec is it relies on a common understanding of technical terms without actually defining them precisely anywhere.
To take one example, it never defines what an interrupt is and, more broadly, never defines terminology around exceptions. Contrast this with the Arm ISA, which precisely describes what it means by asynchronous vs synchronous, precise vs imprecise, etc. (see section D3-1 in https://developer.arm.com/documentation/ddi0487/latest/).
The original authors may see this as a virtue; the small size of the RISC-V ISA manuals vs Arm's was portrayed as a great benefit, but in part that size is because it's missing lots of stuff like this that I view as highly important for a specification.
- timhh 12 days ago
  Yeah I completely agree. Especially annoying if RISC-V is the first ISA you've learnt which is probably the case for a lot of people.
  I don't think you meant D3-1 btw.
  - gchadwick 12 days ago
    Ah yes I meant D1-3!
sweetjuly 13 days ago
This is one of the things I think is most sorely missing from RISC-V. ARM provides executable (but perfectly legible) pseudocode for every instruction. You don't have to rely on natural language to understand what an instruction does, which is really important when dealing with very complex ISA features which have many different (and sometimes contradictory) extensions. SAIL sort of fulfills this purpose if you squint but it doesn't feel like a specification like ARM pseudocode so much as a theorem proving language which happens to be the reference for the ISA.
- Peter_Sewell 12 days ago
  Here is a piece of the Arm-A ASL:
```
    (result, -) = AddWithCarry(operand1, imm, '0');
    if d == 31 then
      SP[] = ZeroExtend(result, 64);
    else
      X[d, datasize] = result;
```
  and here is some analogous Sail (automatically translated from the ASL, in fact):
```
    (result, nzcv) = AddWithCarry(operand1, operand2, carry_in);
    if d == 31 then {
        SP_set() = ZeroExtend(result, 64)
    } else {
        X_set(d, datasize) = result
    }
```
  I'm somewhat at a loss to understand in what way that's less readable?
  (There surely are things that can be improved, of course - e.g. perhaps the concrete syntax for fancy types - but readability was a primary goal in the Sail design; it's really not a theorem-proving language or something that needs fancy PL knowledge.)
  - sweetjuly 12 days ago
    Hmm, thanks for keeping me honest; I took another look after reading your response and I've changed my mind a bit :)
    I think my primary criticism is the way in which the formal specification is surfaced in RISC-V as opposed to ARM rather than any issue with Sail in particular.
    In ARM, the relevant portions of the pseudocode/ASL are available directly in the architecture reference manual. A snippet is either included directly underneath the relevant instruction or there is a link to it. This makes it very easy to use since only the relevant bits are presented when it is being used for reference/human spec reading purposes. This relationship to pseudocode/ASL, of course, makes sense given the history of ASL. The documentation pseudocode was originally just a nice and clearer way to explain how instructions work (free from the trappings of natural language). Later, they cleaned up the pseudocode and created an actual language specification for it so that it could be used as a formal model.
    In RISC-V, we don't really get any such luxuries. If you find yourself confused about how exactly an instruction works, you have to somehow know that the Sail model is the official specification (it is not called out anywhere in the natural language spec!) and then go trawl through the entire model yourself hunting for the definition of your instruction. This isn't super hard if you know what you're doing, but there is a lot of other machinery in these models that you can trip over (it includes instruction binary encoding formats and the assembler language specification as well). I would say that I'm comfortable using the model as a reference but it's definitely not as easy to access as it is over on ARM.
    I see now though that there is an effort to integrate the Sail source into the natural language specification. I hope that makes it eventually, it'll be a big quality of life improvement.
timhh 12 days ago
I've been doing a lot of work with Sail (not SAIL btw) and I'm not sure I agree with the points about it.
There's already a way to extract functions into asciidoc as the author noted. I've used it. It works well.
The liquid types do take some getting used to but they aren't actually used in most of the code; mostly for utility function definitions like `zero_extend`. If you look at the definition for simple instructions they can be very readable and practically pseudocode:
https://github.com/riscv/sail-riscv/blob/0aae5bc7f57df4ebedd...
A lot of instructions are more complex of course but that's what you get if you want to precisely define them.
Overall Sail is a really fantastic language and the liquid types really help avoid bugs.
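For readers who have not met liquid types, a loose Python analogue of what they buy you for a helper like `zero_extend`. The `Bits` class below is a hypothetical helper written for this sketch, not part of Sail or any library; the point is that Python can only raise at run time, whereas Sail is described as checking bitvector lengths statically.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Bits:
    """A bitvector with an explicit width (hypothetical helper for this sketch)."""
    width: int
    value: int

    def __post_init__(self):
        # Invariant: the value must fit in the declared width.
        if self.width < 0 or not 0 <= self.value < (1 << self.width):
            raise ValueError(f"value {self.value:#x} does not fit in {self.width} bits")

def zero_extend(new_width: int, x: Bits) -> Bits:
    # A length-aware type system can reject new_width < x.width before the
    # code ever runs; this sketch can only detect it when the call executes.
    if new_width < x.width:
        raise ValueError("zero_extend cannot shrink a bitvector")
    return Bits(new_width, x.value)

print(zero_extend(64, Bits(12, 0xFFF)))
```

The two `raise` statements are exactly the class of bug that never reaches a running model when the widths are part of the type.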
The biggest actual problems are:
1. The RISC-V spec is chock full of undefined / implementation defined behaviour. How do you capture that in code, where basically everything is defined? The biggest example is probably WARL fields which can do basically anything. Another example is decomposing misaligned accesses. You can decompose them into any number of atomic memory operations and do them in any order. E.g. Spike decomposes them into single byte accesses. (This problem isn't really unique to Sail tbf).
2. The RISC-V Sail model doesn't do a good job of letting you configure it currently. E.g. you can't even set the spec version at the moment. This is just an engineering problem though. We're hoping to fix it one day using riscv-config which is a YAML file that's supposed to specify all the configurable behaviour about a RISC-V chip.
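To illustrate the misaligned-access point in item 1, here is a small sketch of two different decompositions of the same access. Both function names are made up for this example, and neither is taken from Spike or the Sail model; they simply show that an executable model has to pick one behaviour where the prose spec allows many.

```python
def decompose_bytewise(addr: int, size: int):
    """Split an access into single-byte accesses, lowest address first."""
    return [(addr + i, 1) for i in range(size)]

def decompose_aligned(addr: int, size: int):
    """Split an access into the largest naturally aligned power-of-two pieces."""
    pieces = []
    end = addr + size
    while addr < end:
        chunk = 1
        # Grow the chunk while it stays aligned and inside the access.
        while addr % (chunk * 2) == 0 and addr + chunk * 2 <= end:
            chunk *= 2
        pieces.append((addr, chunk))
        addr += chunk
    return pieces

# A 4-byte access at an odd address, as (address, size) pairs.
print(decompose_bytewise(0x1001, 4))
print(decompose_aligned(0x1001, 4))
```

A configurable model would need a parameter selecting between strategies like these, plus the order in which the pieces are issued.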
I definitely agree about the often wooly language in the spec though. It doesn't even use RFC-style MUST/SHOULD/MAY terms.
- metaprogram 12 days ago
  I agree on the beauty of liquid types. I encourage HN to learn Sail, simply to get experience with liquid types. The Sail specification of RISCV is currently the only place where liquid types are used in industrial code. (There is Liquid Haskell and Liquid Rust, but they are research prototypes.) I expect that over the next decade or two, liquid types will come to the programming mainstream.
  The configurability problem has several main parts.
  (1) It's a Turing complete problem as the informal RISCV specification gives you so much freedom, including all those IMPDEFs. The specification of a complex, configurable ISA will always be intrinsically complex and syntax can only do so much about presenting this intrinsic complexity.
  (2) Sail's lack of a clear, easy to use module system for configuration of the exact RISCV version to be emulated.
  - timhh 12 days ago
    On 2 there is this now:
    https://alasdair.github.io/#_modules
    But it's very new and the RISC-V model doesn't use it yet. I think it's also just a first step rather than fully solving the problem.
- rwmj 12 days ago
  Ssstrict is supposed to address the undefined behaviour problem, or at least it'll make undefined instructions actually trap.
  https://github.com/riscv/riscv-profiles/blob/main/rva23-prof...
  - timhh 12 days ago
    That's not really what I meant. It's very easy to configure which instructions or CSRs exist, and excluding custom extensions there are really only two options for behaviour, so you just have a flag `haveSomeExtension()` to enable the instructions and CSRs. The Sail model has some of these flags already.
    If you write to a WARL field in a CSR the chip can use more or less any legalisation logic it wants. Configuring that is very difficult (though there is a decent attempt in riscv-config).
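    A minimal sketch of why this is hard to configure. The 2-bit field, its legal values and the two policies below are all invented for illustration; this is not how riscv-config or the Sail model expresses it.

```python
# Assumption: a made-up 2-bit WARL field whose only legal values are 0 and 1.
LEGAL_VALUES = {0, 1}

def legalise_keep_old(old: int, written: int) -> int:
    """Policy A: ignore an illegal write and keep the previous value."""
    return written if written in LEGAL_VALUES else old

def legalise_clamp(old: int, written: int) -> int:
    """Policy B: replace an illegal write with a fixed legal value."""
    return written if written in LEGAL_VALUES else 0

# A model has to choose (or be told) which policy a given chip implements.
POLICIES = {"keep_old": legalise_keep_old, "clamp": legalise_clamp}

def write_field(policy: str, old: int, written: int) -> int:
    return POLICIES[policy](old, written)

# The same illegal write gives different, equally permitted results.
print(write_field("keep_old", 1, 3), write_field("clamp", 1, 3))
```

    Two named policies are easy to put in a config file; the difficulty is that real legalisation logic can be an arbitrary function of the old value, the written value and other state.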
    - rwmj 12 days ago
      Yes I don't think Ssstrict is very complete. In fact while I was looking for it, I'm not sure it's been fully defined yet, beyond some postings on the tech-profiles list.
zyedidia 12 days ago
I am excitedly awaiting the full release of ASL1 from Arm. I wonder if anyone with more knowledge might be able to comment on how it compares with Sail and/or when we might expect to see a full Arm specification in ASL1 (as opposed to the current spec which is normal ASL and appears to be incompatible with the upcoming version). Perhaps in the future there might also be a RISC-V specification written in ASL1.
- Peter_Sewell 12 days ago
  Sail is pretty similar to ASL (both current ASL and ASL 1.0) except that (1) it has a more expressive type system, so that bitvector lengths can all be statically checked, (2) it has proper tagged unions and pattern matching, and (3) there's a wide range of open-source tooling available, for execution, specification coverage, generating emulators, integrating with relaxed concurrency models, generating theorem-prover definitions, etc. We've recently updated the Sail README, which spells some of this out: https://github.com/rems-project/sail .
  As Alastair Reid says, one of the main things missing in the current RISC-V specification documents is simply that the associated Sail definitions are not yet interspersed with the prose instruction descriptions. The infrastructure to do that has been available for some time, in the Sail AsciiDoc support by Alasdair Armstrong (https://github.com/Alasdair/asciidoctor-sail/blob/master/doc...) and older LaTeX versions by Prashanth Mundkur and Alasdair (https://github.com/rems-project/riscv-isa-manual/blob/sail/r...).
- adreid 10 days ago
  Note that herd contains an ASL1 implementation. https://github.com/herd/herdtools7/tree/master/asllib
chrsw 12 days ago
The part about needing to be fluent in programming language research in order to understand the SAIL specification is a great point. I think it speaks to the origins of RISC-V being by and for computer science or computer engineering academics and not for practicing digital designers for commercial systems.
- hajile 12 days ago
  I believe the first cores were written in Chisel, which is a Haskell/ML type HDL, so SAIL fits right in.
  Just because people are used to other ways of doing things doesn’t make those ways better. “I don’t want to learn something new“ certainly isn’t a compelling argument.
  - chrsw 12 days ago
    Learning new things is good. This should be encouraged. But going from SystemVerilog to Chisel is not a small step. It's not really a question of what's better or worse. It's a paradigm shift in the way you think about how a design is captured and expressed. It's not like learning how to use a new dishwasher or driving a new car. It's learning a new stack of tools and flows. All under the pressure of delivering a new product on spec, on time and on budget. For many teams running in a tight cycle of design->tape out, it is not realistic to ask them to "learn something new".
    When you change the context, the perception changes. If you're a grad student who has chosen to dive head first into Chisel after already mastering functional programming and want to get more into hardware design, that's a different story.
- Peter_Sewell 12 days ago
  Neither of these are actually the case: one doesn't need to be fluent in PL research to understand Sail specs (see @timhh comment above, forex), and Sail was developed independently of the origins of RISC-V.
artisanspam 12 days ago
> The easiest way to improve this would be to capture as much of the architecture as possible in formats that are easy to read and manipulate. In particular, instruction encodings and control/status registers are easily described by simple JSON/YAML/XML/… formats.
This has been something I wish was available for ARM pseudocode. It’d be ideal to just generate an equivalent Python, SystemVerilog, etc. library from the ARM ARM instead of having to reimplement a subset of it yourself.
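As a sketch of the kind of library that could be generated, here are generic accessors driven by a field table. The dictionary layout is made up for this example; the bit positions shown are those of the `MIE`, `MPIE` and `MPP` fields of the RISC-V `mstatus` register.

```python
# Illustrative description of a few mstatus fields as (high_bit, low_bit).
# The table layout is an assumption made for this sketch.
CSR_FIELDS = {
    "mstatus": {"MIE": (3, 3), "MPIE": (7, 7), "MPP": (12, 11)},
}

def _mask(hi: int, lo: int) -> int:
    return (1 << (hi - lo + 1)) - 1

def get_field(csr: str, field: str, value: int) -> int:
    """Read one named field out of a raw CSR value."""
    hi, lo = CSR_FIELDS[csr][field]
    return (value >> lo) & _mask(hi, lo)

def set_field(csr: str, field: str, value: int, new: int) -> int:
    """Return the raw CSR value with one named field replaced."""
    hi, lo = CSR_FIELDS[csr][field]
    mask = _mask(hi, lo)
    return (value & ~(mask << lo)) | ((new & mask) << lo)

raw = set_field("mstatus", "MPP", 0, 3)
print(hex(raw), get_field("mstatus", "MPP", raw))
```

The same table could just as well be emitted as SystemVerilog or C headers, which is the appeal of keeping it outside any one language.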
- adreid 10 days ago
  This has been available for arm for some years. Here is a blog post I wrote at around the time it was released. The easiest bits to use would be the instruction formats and the register fields
  https://alastairreid.github.io/dissecting-ARM-MRA/
hlandau 12 days ago
Yeah, the low quality and informal "tutorial" presentation of the RISC-V "specification" has always been very jarring to me given the popularity of RISC-V.
The authoritative description should be machine readable, but the PDF also needs to be authoritative. That means the PDF needs to be generated from the machine readable spec automatically.
I think the highest quality ISA spec in the industry at this point is ARM's ARMv8/9 ISA manual - it's a readable PDF with pseudocode in a well defined language. Every ISA rule is even given a unique identifier. But even better, it's all generated from XML source files which ARM releases, so you can parse the pseudocode out of those and run that code directly if you write an interpreter for it.
I was hoping to see ARM also release this XML for ARMv8-M but last I checked they haven't done so, sadly.
It's highly worthwhile to examine different ISA manuals to compare their relative approaches (e.g. Power, SPARC, ARMv8, Hexagon, s390x, Itanium, etc.). But I think ARMv8 is the best.
IAmLiterallyAB 13 days ago
SAIL reminds me of SLEIGH, the language Ghidra uses to describe ISA semantics. Cool stuff
azubinski 12 days ago
"Conclusion The RISC-V architecture was developed in classic startup/academic style: innovating quickly and avoiding too much investment in long-term engineering"
But this is not a conclusion in any sense. This is the premise of the entire RISC-V project. Because RISC-V is just an abstraction to teach and learn machine-level aspects of programming regardless of the specific processors available on the market. Donald Knuth has the same, MMIX. In addition it is difficult to say what innovations there may be in the RISC processor instruction system.
Well...
drycabinet 13 days ago
Is RISC-V technically superior or is it just about the license?
- __s 13 days ago
  It's simpler than x86/arm, while having a couple improvements compared to minimal alternatives like MIPS based on hindsight & modern chips
  See chapter 2: https://www2.eecs.berkeley.edu/Pubs/TechRpts/2016/EECS-2016-...
- RetroTechie 12 days ago
  It's designed with the luxury of hindsight on longtime-existing ISAs. Avoiding many pitfalls in those. While not attempting to innovate in ways that may or may not work out.
  Also the base ISA is very implementer-friendly. As in: requiring few transistors / FPGA LEs, (relatively) easy to write a compiler or emulator for, etc. But that is hardly unique.
  32b and 64b flavours very similar. Oh and... modular.
  That doesn't make it 'better' though. Eg. x86 has a looott of legacy cruft. But also a looott of high-quality software for it. RISC-V: many of those tools are still being written / adapted / optimized. Likewise, x86 & ARM have many high-performance, efficient and/or low-cost implementations. RISC-V is catching up quickly, but not (yet) head-to-head with those.
- gchadwick 13 days ago
  ISAs are complex things so you can't say one is technically superior to the other.
  I would say you're right though in that RISCV enjoys the success it's seeing due to the open specification and licensing model. People generally aren't drawn to RISCV because of technical innovation.
  - brucehoult 12 days ago
    The base specification (IMAFDC) has little to no innovation, simply avoiding the mistakes of the past. We've got 60 years of experience with RISC-style instruction sets, so that's about consolidation not innovation.
    However RISC-V is an excellent base upon which to innovate. You can see that in things such as the Vector extension, the memory model developed by industry and academic experts world-wide, and CHERI fine-grained memory-protection.
    - fanf2 12 days ago
      CHERI was originally based on MIPS, then ported to ARM and later RISC-V.
      - brucehoult 12 days ago
        The place you can get a real physical commercially available CHERI implementation is RISC-V:
        https://codasip.com/press-release/2023/10/31/codasip-deliver...
        That's largely because if you base a product on Arm or MIPS you have the choice of getting them to actively invest in and support you, or getting sued into oblivion by them.
        THAT is why RISC-V is the most friendly ISA to innovation and where most future innovation will happen. Because innovation comes not only from internally inside Intel or Arm or MIPS (who have switched to RISC-V now anyway) but from a myriad of possible sources.
  - rwmj 12 days ago
    The base RISC-V tends to avoid innovation, because we've been burned in the past (register windows, branch delay slots etc)
- zik 12 days ago
  This article's just about the technical writing of the manual. It's nothing to do with the ISA itself.
- DeathArrow 12 days ago
  Superior in what way? Is English superior to French? I think you can have equally good implementations of modern CPUs regardless of the ISA. The end result is trillions of logical gates that working together will store, add, subtract, multiply at a rate of several millions per second. Logical gates don't care what language you speak to them as long as they understand.
- goodpoint 12 days ago
  It's waaay more modern than other ISAs in its design. It can scale from microcontroller to simple CPU to GPU-like vector processing to very powerful CPUs without having to add thousands of CISC instructions.
  E.g. The ISA is modular. You can use the RV64GC set of instructions to implement a very basic Linux-capable CPU that executes one instruction at a time.
  Then you can build an advanced CPU that does OOO and instruction compression and run the same binary *efficiently*.
ur-whale 12 days ago
A slightly off-topic point but nevertheless worth mentioning (re-iterating in fact): much like ISA's , API's of any kind should first and foremost be described in a machine-readable format.
They unfortunately very rarely are.
And in these days of LLM's and whatnot, the human-readable version should easily be automatically derived from the machine-readable one, not the other way round.
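A tiny sketch of that direction of derivation, with no LLM needed for the simple cases: the table a reader sees is rendered from the data a tool consumes. The list-of-tuples layout is an assumption made for this example; the bit positions are those of the RV32I I-type format.

```python
# Machine-readable description: (field name, high bit, low bit).
I_TYPE = [("imm[11:0]", 31, 20), ("rs1", 19, 15), ("funct3", 14, 12),
          ("rd", 11, 7), ("opcode", 6, 0)]

def render_table(title: str, fields) -> str:
    """Render a plain-text field table from the machine-readable description."""
    lines = [title, f"{'field':<12}{'bits':<8}{'width':>5}"]
    for name, hi, lo in fields:
        lines.append(f"{name:<12}{f'{hi}:{lo}':<8}{hi - lo + 1:>5}")
    return "\n".join(lines)

print(render_table("I-type", I_TYPE))
```

Because the prose table is generated, it cannot drift out of sync with the description the tools use.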
photonbucket 13 days ago
I once thought it would be nice to write a toy riscv isa simulator, but was also surprised and discouraged by the natural language spec
- s-holst 11 days ago
  Here is one in less than 200 lines of python: https://github.com/s-holst/tinyrv . It uses machine-readable specs from https://github.com/riscv/riscv-opcodes ; yet I needed to extract immediate bit scrambling from their LaTeX sources :). I wonder if there is an easier way. Anyways, the opcode semantics are hand-coded and it simulates enough to boot linux.
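  For anyone wondering what "immediate bit scrambling" means here, a minimal sketch of decoding the JAL (J-type) offset. This is written from the published instruction format for illustration and is not code from tinyrv; the function names are made up.

```python
def sign_extend(value: int, bits: int) -> int:
    """Interpret the low `bits` bits of value as a two's-complement number."""
    sign = 1 << (bits - 1)
    return (value ^ sign) - sign

def j_imm(word: int) -> int:
    """Reassemble the JAL offset from its scattered instruction bits."""
    imm = (
        ((word >> 31) & 0x1) << 20      # inst[31]    -> imm[20]
        | ((word >> 21) & 0x3FF) << 1   # inst[30:21] -> imm[10:1]
        | ((word >> 20) & 0x1) << 11    # inst[20]    -> imm[11]
        | ((word >> 12) & 0xFF) << 12   # inst[19:12] -> imm[19:12]
    )
    return sign_extend(imm, 21)         # imm[0] is always zero

# 0x008000EF encodes `jal x1, 8`; 0xFFDFF06F encodes `jal x0, -4`.
print(j_imm(0x008000EF), j_imm(0xFFDFF06F))
```

  The four shift/mask pairs are exactly the information that would ideally live in a machine-readable file rather than in a LaTeX diagram.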
- eru 13 days ago
  If you only want to simulate the core instruction, and perhaps the Multiply extension or so, and only plan to support single threaded operation, then the natural language spec ain't so bad.
atVelocet 12 days ago
Would a format like SVD fit his description? And if not: Why?
Reason077 13 days ago
Perhaps it's time for a RISC-VI specification!
- hajile 12 days ago
  That would upset the church of emacs though.
fwsgonzo 13 days ago
I honestly think the very readable specification has been a boon for RISC-V and possibly part of the reason why people continue to find it easy to pick up. If you are unsure about something in the spec, there's also a multitude of RISC-V emulators out there, probably several in your favorite language already.
- tredre3 12 days ago
  > If you are unsure about something in the spec, there's also a multitude of RISC-V emulators out there, probably several in your favorite language already.
  I've been working with risc-v assembly a fair amount and sometimes I've had to resort to looking at emulators' and even compilers' code because the documentation is so lacking. Sometimes the emulators do different things in a given scenario. Sometimes real chips do different things in the same scenario. And I suspect the discrepancy is caused by lax documentation rather than implementation failure.
  It doesn't have to be that way.
- gchadwick 13 days ago
  This is a good point though I'd argue that the spec isn't all that readable for some important things.
  For example the description of exception handling is sprinkled between the CSR definitions of the CSRs involved. There's no section that just lays out the exception model.
  - brucehoult 12 days ago
    True, but the entire "Machine-Level CSRs" section is under 30 pages, and the parts you need to understand what happens when an exception happens are basically parts of the `mstatus` description, specifically "3.1.6.1 Privilege and Global Interrupt-Enable Stack in mstatus register" (one page), the section on `MPRV` (less than a page), the sections on `mtvec`, `medeleg`, `mideleg`, `mip`, `mie` (five contiguous pages), the sections on `mscratch`, `mepc`, `mcause`, `mtval` (about five contiguous pages), and the sections on the `ecall`, `ebreak`, `mret`, `sret` instructions (about a page), and the sections on what happens at RESET and NMIs (just over a page).
    So that's about 14 pages. This doesn't seem onerous to me, and counts as "in one place". It's about half of the entire M mode spec, which you might as well simply read in its entirety (two or three times), not try to cherry-pick.
    I mean ... you don't want a solid 14 pages of description with no section heading at all, do you?
    Maybe you want a single page that gives a 30,000 foot view. That would be possible, but it would inevitably have to leave out a lot and you'll need to read the detailed descriptions anyway.
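    As an illustration of what such a 30,000 foot view could look like in executable form, here is a deliberately simplified sketch of M-mode trap entry and `mret`. It assumes `mtvec` is in direct mode and ignores delegation (`medeleg`/`mideleg`), `MPRV`, S-mode, vectored interrupts and NMIs, so it leaves out a lot, as predicted. The `Hart` class and function names are made up for this example.

```python
from dataclasses import dataclass

M_MODE, U_MODE = 3, 0

@dataclass
class Hart:
    pc: int = 0
    priv: int = M_MODE
    mie: bool = False    # mstatus.MIE
    mpie: bool = False   # mstatus.MPIE
    mpp: int = U_MODE    # mstatus.MPP
    mepc: int = 0
    mcause: int = 0
    mtval: int = 0
    mtvec: int = 0

def take_trap(h: Hart, cause: int, tval: int = 0) -> None:
    """Enter the M-mode handler (direct mode, no delegation assumed)."""
    h.mepc = h.pc            # where to resume
    h.mcause = cause         # why we trapped
    h.mtval = tval           # extra information, e.g. a faulting address
    h.mpie = h.mie           # stack the interrupt-enable bit
    h.mie = False            # interrupts off inside the handler
    h.mpp = h.priv           # remember the interrupted privilege mode
    h.priv = M_MODE
    h.pc = h.mtvec & ~0x3    # BASE field; the low two bits hold MODE

def mret(h: Hart) -> None:
    """Return from an M-mode handler by unstacking what take_trap saved."""
    h.mie = h.mpie
    h.mpie = True
    h.priv = h.mpp
    h.mpp = U_MODE           # assumes U-mode is implemented
    h.pc = h.mepc

# An `ecall` from U-mode (cause 8) and the matching return.
h = Hart(pc=0x100, priv=U_MODE, mie=True, mtvec=0x80000000)
take_trap(h, cause=8)
mret(h)
print(hex(h.pc), h.priv)
```

    Even this much fits on a page, but every omitted case is one where a reader would still have to go back to the detailed CSR descriptions.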
- weebull 12 days ago
  > I honestly think the very readable specification has been a boon for RISC-V and possibly part of the reason why people continue to find it easy to pick up.
  Totally agree
  > If you are unsure about something in the spec, there's also a multitude of RISC-V emulators out there, probably several in your favorite language already.
  ...but this is a problem. This means that it's not the specification saying what the standard is, but an implementation. People copying different implementations get different behaviours.
  - childintime 2 days ago
    Am I the only one that thinks an implementation is actually the better specification, if readable? What better way to describe state mutations than actual code, provided the language used is well-specified?
    Readability is key here, as the opposite, a black box emulator, would not provide the same value.
    > People copying different implementations get different behaviours
    People copying black box behavior get different behaviors, necessarily.
    Using English results in palpability, not specification. Major distinction. Indeed specs are intended to make an hitherto unseen vision palpable. But it's better to provide a mold, and verify implementations fit the mold.