The goal of this DIY project is to build a bandoneon MIDI keyboard with velocity.
2024/01/18
Microcontroller development boards are a huge time saver. Adafruit has a series of small form factor boards called Feathers.
These boards can be powered via USB or battery and utilize a voltage regulator to maintain a 3.3V operation.
The Adafruit Feather M4 Express is built around the ATSAMD51J19A. It looks like an excellent choice because:
- it has plenty (21) input-output pins
- it runs at a high clock speed of 120 MHz, which should make it possible to bit-bang many analog-to-digital SPI buses at 3.6 MHz. It also has 32-bit floating-point hardware.
- it has a single core, which keeps the temptation to write concurrent code away
- it does not have any extraneous wireless features.
The first thing to do is to check the bootloader version.
The bootloader, the initial program running on the processor, resides at the start of the embedded flash, mapped at address 0x0 (see 9.2 Physical Memory Map and 25.6.2 Memory Organization). Typically compact, its role is to load and initiate the main program. In microcontrollers, it also facilitates updating the main program.
Adafruit employs a fork of Microsoft's UF2 bootloader, which implements the USB Mass Storage Class. This means the board appears as a drive when connected via USB. The term 'UF2' refers to a file format containing instructions for rewriting device memory. The file format specifies a series of memory addresses and the content to be written there. Copying such a file to the drive overwrites the memory and thereby replaces the program, a straightforward method requiring no extra hardware.
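To make the "series of memory addresses and content" concrete, here is a minimal sketch of one UF2 block as I understand it from the public UF2 format description: a fixed 512-byte record with magic numbers, a target address, and a payload. The helper names `make_block` and `is_uf2_block` are made up for this illustration, and the address used in the checks is only an example.

```cpp
#include <cstddef>
#include <cstdint>

// One UF2 block is exactly 512 bytes, so it maps onto one mass-storage sector.
struct Uf2Block {
  uint32_t magic_start0;  // 0x0A324655, the bytes "UF2\n"
  uint32_t magic_start1;  // 0x9E5D5157
  uint32_t flags;
  uint32_t target_addr;   // Flash address where the payload is written.
  uint32_t payload_size;  // Used bytes in data[] (at most 476, 256 is typical).
  uint32_t block_no;      // Index of this block, starting at 0.
  uint32_t num_blocks;    // Total number of blocks in the file.
  uint32_t file_size;     // File size or family ID, depending on flags.
  uint8_t data[476];      // Payload, zero padded.
  uint32_t magic_end;     // 0x0AB16F30
};

constexpr uint32_t kUf2MagicStart0 = 0x0A324655u;
constexpr uint32_t kUf2MagicStart1 = 0x9E5D5157u;
constexpr uint32_t kUf2MagicEnd = 0x0AB16F30u;

// Hypothetical helper: builds an empty block aimed at target_addr.
constexpr Uf2Block make_block(uint32_t target_addr, uint32_t block_no,
                              uint32_t num_blocks) {
  Uf2Block b{};
  b.magic_start0 = kUf2MagicStart0;
  b.magic_start1 = kUf2MagicStart1;
  b.target_addr = target_addr;
  b.payload_size = 256;
  b.block_no = block_no;
  b.num_blocks = num_blocks;
  b.magic_end = kUf2MagicEnd;
  return b;
}

// Hypothetical helper: the sanity check a bootloader would run first.
constexpr bool is_uf2_block(const Uf2Block &b) {
  return b.magic_start0 == kUf2MagicStart0 &&
         b.magic_start1 == kUf2MagicStart1 && b.magic_end == kUf2MagicEnd;
}
```

Because every block carries its own address, the bootloader can process blocks independently and in any order.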
While both CircuitPython and Arduino workflows are well-documented, I will avoid using CircuitPython to prevent the garbage collector from introducing jitter into my time series sampling.
Thanks to Adafruit and Arduino's excellent work, the integration of drivers and boards in the IDE is flawless. It took two attempts to flash the device, but now the board's LED is blinking. Hello world!
2024/01/19
Musical Instrument Digital Interface, better known as MIDI, is a communication protocol used for transmitting musical notes over a serial interface. These notes are received by a synthesizer, which then produces the corresponding waveforms.
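As an illustration of what a note looks like on the wire, a MIDI Note On message is three bytes: a status byte (0x90 plus the channel), the note number, and the velocity, the last two limited to 7 bits. The sketch below only builds the bytes; the struct and function names are made up for this example and are not part of any library.

```cpp
#include <cstdint>

// Three raw bytes of a MIDI channel voice message.
struct MidiMessage {
  uint8_t status;
  uint8_t data1;
  uint8_t data2;
};

// Note On: status 0x9n where n is the channel (0 to 15).
constexpr MidiMessage note_on(uint8_t channel, uint8_t note, uint8_t velocity) {
  return {static_cast<uint8_t>(0x90 | (channel & 0x0F)),
          static_cast<uint8_t>(note & 0x7F),
          static_cast<uint8_t>(velocity & 0x7F)};
}

// Note Off: status 0x8n, same payload layout.
constexpr MidiMessage note_off(uint8_t channel, uint8_t note, uint8_t velocity) {
  return {static_cast<uint8_t>(0x80 | (channel & 0x0F)),
          static_cast<uint8_t>(note & 0x7F),
          static_cast<uint8_t>(velocity & 0x7F)};
}
```

For example, `note_on(0, 60, 100)` encodes middle C on the first channel at velocity 100.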
There are two hardware solutions: a 5-pin DIN connector or USB MIDI.
The electrical specifications for the 5-pin DIN MIDI reveal that it utilizes a UART serial bus along with power. Fortunately, most modern microcontrollers now support UART.
Out of the five wires, only four are used: these are for Ground (GND), 3.3 volts (3V3), UART receive (RX), and UART transmit (TX).
UART (Universal Asynchronous Receiver-Transmitter) is an efficient serial communication protocol that uses just two wires: TX (transmit) and RX (receive). There is no dedicated clock signal: both ends agree on a baud rate beforehand, and the receiver resynchronizes on the start bit of each byte. A parity bit can optionally be added for error detection; MIDI does not use one and runs at 31250 baud with 8 data bits and 1 stop bit.
The Arduino MIDI library works instantly.
I used Adafruit's MIDI FeatherWing for prototyping and it works like a charm.
In this setup, MIDI information travels through the same USB cable that is used for power, programming, and retrieving logs from the board.
To enable MIDI via USB, you need to use TinyUSB, which can be selected from the Tools > USB Stack menu in the Arduino IDE. By default, the USB driver that is built with the program is ArduinoUSB. TinyUSB allows for the support of additional protocols and enables the device to function as a MIDI interface.
The board shows up as Feather M4 Express in my Digital Audio Workstation and outputs MIDI notes.
Even with this setup, serial logging still works just fine.
2024/01/19
Analog-to-Digital Converters, also known as ADCs, measure the voltage on a pin and convert it to numerical data for transmission over a serial bus. These components usually have multiple input channels connected to a single successive-approximation ADC through a multiplexer (MUX).
I selected the MCP3008 which looks popular. There is a tutorial available on Adafruit, and the midi_hammer project has successfully integrated it with hall sensors to create a keyboard (demo).
The MCP3008 can measure voltage from one of its eight inputs and then transmit the result via an SPI (Serial Peripheral Interface) serial bus. Unlike UART, SPI includes an explicit clock signal, which means it uses three wires, named:
- Clock
- MOSI (Master output slave input)
- MISO (Master input slave output)
A fourth pin, CS (Chip Select), is used to activate or deactivate the peripheral, effectively controlling when it can send or receive data.
The tutorial features Python code, but luckily, Adafruit also offers an Arduino library, complete with examples. All I need to do is set the number for the CS pin, and it works seamlessly.
adc.begin(13);
Again, this is a huge time-saver.
I can read the 8 channels:
522 0 1023 1 1 0 0 0 [150]
522 0 1023 0 0 0 0 0 [151]
522 0 1023 2 1 0 0 1 [152]
522 0 1023 4 3 3 4 8 [153]
- The first channel shows the sensor output. In the absence of magnetic fields, it returns a mid-range value. Since the MCP3008 is a 10-bit converter, the maximum value is 1023, making 522 close to half of this.
- The second is grounded, and as expected, it reads 0.
- The third channel is connected to 3V3 and shows the maximum reading of 1023.
- When a magnet touches the sensor, the readings vary significantly: I can get a value of 788 and, interestingly, 254 when the magnet is rotated 180°.
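The readings above can be turned into more familiar quantities with a couple of one-liners. This is a small sketch assuming the reference voltage is the 3.3 V rail and using the 522 rest value observed above; the function names are made up for the example. The MCP3008 datasheet gives the output code as 1024 × VIN / VREF, which is inverted here.

```cpp
#include <cstdint>

// Rest value observed on channel 0 with no magnet nearby.
constexpr int kRestCounts = 522;

// Convert a 10-bit MCP3008 code to volts (code = 1024 * Vin / Vref).
constexpr double counts_to_volts(int counts, double vref) {
  return counts * vref / 1024.0;
}

// Signed deflection from rest: the sign tells which magnetic pole faces the sensor.
constexpr int deflection(int counts) { return counts - kRestCounts; }
```

With the two magnet orientations measured above, `deflection(788)` and `deflection(254)` come out almost symmetric around the rest value (+266 and -268).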
Interestingly, the Adafruit_MCP3008 library is built on top of Adafruit_BusIO, which abstracts SPI communication. This is particularly useful because it can implement a software bus, exactly what I need for reading from multiple MCP3008 units.
The library source code demonstrates how to modify multiple outputs of a given port by writing to the register:
BusIO_PortReg *mosiPort = (BusIO_PortReg *)portOutputRegister(digitalPinToPort(mosipin));
BusIO_PortMask mosiPinMask = digitalPinToBitMask(mosipin);
// Write
*mosiPort = *mosiPort | mosiPinMask;
This technique will be useful for reading measurements from multiple devices simultaneously.
However, the approach to frequency control appears quite basic. It simply sleeps according to the frequency period, without accounting for the time taken by operations.
int bitdelay_us = (1000000 / _freq) / 2;
delayMicroseconds(bitdelay_us);
I configured the library to use bit banging for the bus by specifying each pin:
adc.begin(10, 12, 11, 13);
Remarkably, this setup worked perfectly on the first try.
It runs at 65300 sps (samples per second), which is half of the specified throughput at 3.3 V.
2024/01/21
The source code below implements software SPI. It writes the content of wbuf to the bus while simultaneously reading into rbuf. This approach is inspired by Adafruit's implementation.
template <std::size_t L>
void transfer(const std::array<uint8_t, L> &wbuf,
              std::array<uint8_t, L> &rbuf) {
  // Only MSB first is supported.
  std::size_t r_idx = 0;
  for (const uint8_t wbyte : wbuf) {
    uint8_t rbyte = 0;
    for (uint8_t bit = 0x80; bit != 0; bit >>= 1) {
      // Present the outgoing bit, then raise the clock.
      digitalWrite(mosi_, (wbyte & bit) != 0);
      digitalWrite(clk_, HIGH);
      delayMicroseconds(clk_period_us_);
      // Shift in the incoming bit while the clock is high.
      rbyte = (rbyte << 1) | digitalRead(miso_);
      digitalWrite(clk_, LOW);
      delayMicroseconds(clk_period_us_);
    }
    rbuf[r_idx++] = rbyte;
  }
}
The ADC protocol uses 17 bus clock cycles to get one measurement.
Field | Bits | MOSI | MISO |
Start bit | 1 | “1” | don’t care |
Single / Diff | 1 | “1” | don’t care |
Channel select | 3 | D2, D1, D0 | don’t care |
Sample Period | 1 | don’t care | don’t care |
Null bit | 1 | don’t care | “0” |
Sample | 10 | don’t care | data |
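The table maps onto bytes as follows when the 17 bits are padded to 3 bytes with leading zeros, which is the common way to drive the MCP3008 from a byte-oriented SPI API. This is a sketch under that assumption; `make_request` and `decode_sample` are names invented for the example.

```cpp
#include <cstdint>

// Three bytes clocked out on MOSI for one conversion.
struct Mcp3008Frame {
  uint8_t tx[3];
};

// Byte 0: seven padding zeros then the start bit.
// Byte 1: SGL/DIFF, D2, D1, D0 in the top nibble, the rest is don't care.
// Byte 2: don't care, only there to clock out the low 8 result bits.
constexpr Mcp3008Frame make_request(uint8_t channel, bool single_ended) {
  return {{0x01,
           static_cast<uint8_t>((single_ended ? 0x80 : 0x00) |
                                ((channel & 0x07) << 4)),
           0x00}};
}

// The null bit and the two top result bits land in the low 3 bits of rx1,
// and the remaining 8 result bits fill rx2. rx0 carries no data.
constexpr uint16_t decode_sample(uint8_t /*rx0*/, uint8_t rx1, uint8_t rx2) {
  return static_cast<uint16_t>(((rx1 & 0x03) << 8) | rx2);
}
```

For instance, a response of `0x02 0x0A` in the last two bytes decodes to 522, the rest value of the Hall sensor seen earlier.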
This code currently runs at 11.48 ksps (kilosamples per second), which translates to 87 μs (microseconds) per sample. However, there's room for improvement. According to the MCP3008 datasheet, it can achieve up to 130 ksps at 3.3V. Here are some potential optimization strategies:
- The most accurate delay function available is delayMicroseconds, but it limits the SPI bus to a maximum of 500 kHz. To reach the desired 120 ksps, a bus speed of 2.16 MHz is required.
- The current code transmits 3 bytes (24 bits) per sample, whereas the minimum is 17 bits, so it sends about 41% more bits than needed.
- The existing delays don't account for the latency of operations between the delays, which adds to the total delay time.
- Using port registers to control output instead of digitalWrite could be more efficient.
- Adafruit has implemented a fast path for cases where the MOSI pin sends consecutive identical values. This bypasses digitalWrite(mosi_, (wbyte & bit) != 0). Surprisingly, this optimization increases the rate to 12.65 ksps (79 μs per sample), an improvement of 10%, which makes me think digitalWrite is very slow.
2024/01/21
The Arduino API implementation for the ATSAMD51J19A by Adafruit utilizes the hardware cycle counter CYCCNT of the Data Watchpoint and Trace (DWT) unit. This method is extremely precise, as the processor clock operates at 120 MHz. It's important to note that the processor clock frequency can be altered, and the current frequency is accessible through SystemCoreClock.
void delayCycles(uint32_t count) {
  // This value takes into account the time spent in the function
  // itself. It has been determined experimentally by comparing the
  // delayCycles() function with micros().
  constexpr uint32_t experimental_bias = 16;
  const uint32_t start = DWT->CYCCNT - experimental_bias;
  // The DWT->CYCCNT register is a 32-bit counter that counts the
  // number of cycles since the last reset. It is incremented every
  // cycle. It wraps around every 2^32 cycles (~36 s at 120 MHz).
  // The unsigned subtraction below stays correct across a wrap.
  while (DWT->CYCCNT - start < count) {
  }
}
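The loop condition in delayCycles() relies on unsigned arithmetic being modulo 2^32, so it keeps working when the counter wraps. The sketch below isolates that comparison so it can be checked on a desktop compiler; `elapsed_cycles` and `deadline_reached` are hypothetical names, and the counter values are passed in rather than read from DWT->CYCCNT.

```cpp
#include <cstdint>

// Cycles elapsed between two counter readings, correct across one wrap.
constexpr uint32_t elapsed_cycles(uint32_t now, uint32_t start) {
  return static_cast<uint32_t>(now - start);  // Modulo 2^32 by definition.
}

// Same test as the busy loop, written as "has the wait finished?".
constexpr bool deadline_reached(uint32_t now, uint32_t start, uint32_t count) {
  return elapsed_cycles(now, start) >= count;
}

// Time between two wraps of a 32-bit counter at 120 MHz, in seconds.
constexpr double kWrapSeconds = 4294967296.0 / 120e6;
```

A naive `now >= start + count` comparison would instead misbehave whenever `start + count` overflows.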
By replacing delayMicroseconds() with delayCycles(), the sample rate has been significantly improved to 23.7 ksps, which is an increase of +87%.
It seems that writing values to the port is time consuming. In a benchmark test using the following code:
uint32_t start = micros();
for (int i = 0; i < 1000000; i++) {
  digitalWrite(12, HIGH);
  digitalWrite(13, HIGH);
  digitalWrite(12, LOW);
  digitalRead(13);
}
uint32_t end = micros();
// Display the cycle frequency: 1e6 iterations over the elapsed
// microseconds gives iterations per microsecond, i.e. MHz.
Serial.print("Cycle frequency: ");
Serial.print(1000000.0 / (end - start));
Serial.println("MHz");
The results indicate a cycle frequency of 0.70 MHz when compiled with the -Ofast optimization and 0.61 MHz with -Os (optimize for size). These performance levels are insufficient for driving the bus at the desired speed of 2.16 MHz.
Direct control of the GPIO via port registers significantly boosts performance. Consider this code snippet:
// Get port and mask from pin number.
auto reg = portOutputRegister(digitalPinToPort(12));
auto mask = digitalPinToBitMask(12);
uint32_t start = micros();
for (int i = 0; i < 1000000; i++) {
  *reg |= mask;
}
Running this code yields a frequency of 13.31 MHz when compiled with -Ofast. As expected, a lower frequency of 3.64 MHz is observed for four register updates.
By replacing the digital* functions with direct pad manipulation, the sample rate can be increased to an impressive 90.1 ksps. However, there's still room for further improvement to reach our target.
2024/01/22
The operation *ioreg |= mask; involves reading, modifying, and then writing back the result, which can be time-consuming. To address this, the ATSAMD51J19A features specialized registers that allow for setting or clearing pins directly by writing a bitmask to them. This functionality significantly streamlines the process.
To find the specific member names required, I had to delve into the code. On Windows, this can be located in the following path:
\Users\$USER\AppData\Local\Arduino15\packages\adafruit\tools\CMSIS-Atmel\1.2.2\CMSIS\Device\ATMEL\samd51\include\instance\port.h
This file is part of the CMSIS (Cortex Microcontroller Software Interface Standard) package provided by Adafruit for the SAMD51 microcontroller.
For reference, in terms of performance:
- The function call digitalWrite(13, HIGH) consumes 41 cycles.
- Direct register manipulation with *ioreg |= 0x800000 takes only 10 cycles.
Here is a verbose way to access OUTSET:
digitalPinToPort(13)->OUTSET.reg = digitalPinToBitMask(13);
This operation runs in 8 cycles. Most of that time is spent in the digitalPinToPort and digitalPinToBitMask lookups rather than in the register write itself. However, these functions return a constant value for a given pin, which offers the possibility of caching for improved efficiency.
auto port = digitalPinToPort(13); // Once for all.
port->OUTSET.reg = 0x800000;
This method is significantly faster, completing the operation in just 3 cycles.
REG_PORT_OUTSET0 = 0x800000;
This is the most straightforward form, and it runs in 3 cycles, which is 3 times faster than the read-modify-write on the IO register and 13 times faster than digitalWrite! With this efficient method of controlling IO, the SPI bit-bang speed now reaches 105.96 ksps, which is quite close to the MCP3008 ADC's specified limit of 120 ksps at 3.3V. However, we have yet to achieve the maximum possible speed.
I should be able to squeeze out a bit more by rewriting the throttler. Without the throttler, the bus runs at 186.96 ksps and appears to return correct values.
Profiler code can be found here
2024/01/23
For the clock or CS, the code involves either clearing or setting a bit. In contrast, the MOSI pin requires setting it to correspond with a specific bit of the data. Here are several methods for handling the MOSI pin, each followed by its measured cost:
*(mosi_value ? &REG_PORT_OUTSET0 : &REG_PORT_OUTCLR0) = mosi_mask_;
// 6 cycles
REG_PORT_OUT0 = (REG_PORT_OUT0 & ~mosi_mask_) | (mosi_value << mosi_shift_);
// 15 cycles
*(&REG_PORT_OUTCLR0 + mosi_value) = mosi_mask_;
// 3 cycles
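The 3-cycle variant works because, in the SAMD51 PORT register map, OUTSET sits one 32-bit word after OUTCLR (offsets 0x18 and 0x14), so adding a 0 or 1 data bit to the OUTCLR address selects the right register without a branch. The following is a desktop mock of that idea, not hardware code: `write_reg` emulates what the peripheral does with a write, and all names are invented for the example.

```cpp
#include <cstdint>

// Byte offsets of the two registers inside a SAMD51 PORT group.
constexpr uint32_t kOutClrOffset = 0x14;
constexpr uint32_t kOutSetOffset = 0x18;

// Word index relative to OUTCLR: 0 selects OUTCLR, 1 selects OUTSET.
constexpr uint32_t reg_index(uint32_t bit_value) { return bit_value & 1u; }

// Mock of the hardware reaction to a write: OUTSET raises the masked pins,
// OUTCLR lowers them, and the other pins are left untouched.
constexpr uint32_t write_reg(uint32_t out, uint32_t index, uint32_t mask) {
  return index == 1u ? (out | mask) : (out & ~mask);
}

// Branchless pin update as seen from the firmware side.
constexpr uint32_t write_bit(uint32_t out, uint32_t bit_value, uint32_t mask) {
  return write_reg(out, reg_index(bit_value), mask);
}
```

The trick depends on that register layout, so it is not portable to other microcontroller families.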
With this new method, we're beginning to observe that the code needs to be throttled. Otherwise, the bus operates at a speed that exceeds the capabilities of the MCP3008.
For the record, I also considered bit-banding, but the feature is not available on the ATSAMD51J19A.
The target is 130 ksps. With 18 clock cycles per sample, the bus clock period is 1 / (130e3 × 18) = 4.27350427E-7 s ≈ 427 ns.
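The same arithmetic can be written as compile-time constants, which also gives the budget in CPU cycles per bus clock. A small sketch using the figures from this log (18 clocks per sample, 120 MHz core clock); the function names are made up.

```cpp
// Bus clock frequency needed for a given sample rate.
constexpr double bus_hz(double samples_per_s, int clocks_per_sample) {
  return samples_per_s * clocks_per_sample;
}

// Duration of one bus clock period in nanoseconds.
constexpr double bit_period_ns(double samples_per_s, int clocks_per_sample) {
  return 1e9 / bus_hz(samples_per_s, clocks_per_sample);
}

// CPU cycles available for each bus clock period.
constexpr double cpu_cycles_per_bit(double cpu_hz, double samples_per_s,
                                    int clocks_per_sample) {
  return cpu_hz / bus_hz(samples_per_s, clocks_per_sample);
}
```

At 130 ksps this leaves roughly 51 CPU cycles per bus clock, to be shared between both clock edges, the MOSI update, and the MISO read.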
Up until now, we've been handling time control with a simple busy loop. The catch is that this method isn't super precise because each cycle includes multiple instructions, and it can only wait for multiples of this cycle duration.
To make things more accurate, we started using the __NOP(); instruction. It's like a little pause button for the processor, letting it wait for just one cycle. But here's the kicker: getting the timing spot on with this method means doing some manual tuning. You have to measure the timing precisely, which can be a bit of a hassle.
One way to do this measurement is by keeping an eye on DWT->CYCCNT. But remember, this approach can be a bit intrusive.
I ordered an AZ-Delivery Logic Analyzer. It is an inexpensive device able to capture the binary state of 8 channels at 24 MHz; the data is then analyzed with Saleae Logic.
It is a delight to use: the software decodes and displays the SPI data as hexadecimal, and displays precise cycle timings for the clock.
After adding a few __NOP(); instructions and splitting the unpacking of MOSI into two parts, one before and one after clearing CLK, the target of 130 ksps (a bus clock period of 1 / (130e3 × 18) ≈ 427 ns) is reached. Yay!
I was able to identify the compiler used by enabling verbose compiler output in the Arduino IDE preferences:
$ Arduino15\packages\adafruit\tools\arm-none-eabi-gcc\9-2019q4/bin/arm-none-eabi-g++ --version
arm-none-eabi-g++ (GNU Tools for Arm Embedded Processors 9-2019-q4-major) 9.2.1 20191025 (release) [ARM/arm-9-branch revision 277599]
Looking at https://godbolt.org/ with the compiler ARM gcc 9.2.1 (none) and -Ofast.
Here is a decompiled code sample.
We can see there that:
- SPIArray::transfer and SPIArray::select are inlined.
- The loops that unpack the measures are unrolled:
for (; i < N / 2; i++) {
  const bool bit = (in >> miso_shifts_[i]) & 0x1;
  rbuf[i] = (rbuf[i] << 1) | bit;
}
It is tempting to make all the pin information, such as masks and shifts, part of the template arguments to turn load instructions into immediates, but overall the code is well optimized and runs fast enough.
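For context, the unpacking loop takes one snapshot of the port input register and extracts one bit per ADC from it, so all MISO lines are sampled at the same instant. Below is a self-contained sketch of that step, assuming each ADC's MISO line sits at a known bit position of the same port; `shift_in` and `unpack_all` are illustrative names, not the ones used in the project.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Shift one incoming bit, taken from position `shift` of the port snapshot
// `in`, into the accumulator of a single ADC (MSB first).
constexpr uint16_t shift_in(uint16_t acc, uint32_t in, uint8_t shift) {
  return static_cast<uint16_t>((acc << 1) | ((in >> shift) & 0x1u));
}

// Apply the same port snapshot to N accumulators, one per ADC.
template <std::size_t N>
void unpack_all(uint32_t in, const std::array<uint8_t, N> &shifts,
                std::array<uint16_t, N> &rbuf) {
  for (std::size_t i = 0; i < N; i++) {
    rbuf[i] = shift_in(rbuf[i], in, shifts[i]);
  }
}
```

Since N is a template parameter, the compiler knows the trip count and is free to unroll the loop, which matches what the disassembly shows.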
2024/01/28
I recorded the following samples at 2 kHz.
We can see that when it is pressed vigorously, the switch travels in 3.8 ms.
Hence it is nice to be able to sustain a sample rate
Notes are considered pressed and released at different thresholds to avoid repeated triggering. The velocity in
When pressed slowly, the instantaneous derivative
- With $q = 1$ this is the instantaneous derivative.
- With $q < 1$ it takes into account the past samples by only accessing the previous sample and the previous derivative.
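The formula itself did not survive in this log, so the following is only one plausible reading of the description: an exponentially smoothed derivative, $d_t = q (x_t - x_{t-1}) + (1 - q) d_{t-1}$, which reduces to the instantaneous derivative for $q = 1$ and otherwise needs only the previous sample and the previous derivative. The hysteresis helper and the threshold values in the usage note are likewise assumptions made for illustration.

```cpp
// Assumed form of the smoothed derivative (exponential moving average).
// q = 1 gives the plain difference between consecutive samples.
constexpr float smoothed_derivative(float q, float x, float x_prev,
                                    float d_prev) {
  return q * (x - x_prev) + (1.0f - q) * d_prev;
}

// Hysteresis: a key becomes pressed at press_threshold and is only released
// once the value falls back to release_threshold or below, which is lower.
constexpr bool next_pressed(bool pressed, int value, int press_threshold,
                            int release_threshold) {
  return pressed ? (value > release_threshold) : (value >= press_threshold);
}
```

With hypothetical thresholds of 750 (press) and 650 (release), a reading hovering around 700 never retriggers the note: it stays released if it was released and stays pressed if it was pressed.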
2024/01/31
I plan to use Wooting 60 gf Hall keyboard switches, with 3D-printed caps.
The ADCs use a capacitor to sample the voltage and then apply successive approximation (a binary search over test voltages) to determine the measurement value. The input is characterised by a resistance and a capacitance. The source impedance should be low enough for the capacitor to charge during the sampling period.
- MCP3008: input impedance 1 kΩ, capacitance 20 pF.
- ATSAMD51J19A: input impedance 2 kΩ, capacitance 3 pF.
Texas Instruments provides a technical document on ADC Source Impedance.
It shows what the input voltage would look like if the source impedance is too high.
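A rough way to quantify "low enough" is the usual first-order RC model: the sampling capacitor charges through the source resistance plus the input resistance, and it must settle to within half an LSB during the sampling window, which takes $(R_s + R_{in}) \cdot C \cdot (N + 1) \ln 2$ for an $N$-bit converter. The sketch below applies this textbook approximation to the figures listed above; it is not a formula taken from either datasheet, and the function name is made up.

```cpp
// Natural logarithm of 2, spelled out to keep the function constexpr.
constexpr double kLn2 = 0.6931471805599453;

// Time for a first-order RC input to settle within 1/2 LSB of an N-bit ADC:
// exp(-t / RC) <= 2^-(N+1), hence t >= RC * (N + 1) * ln 2.
constexpr double settling_time_s(double source_ohm, double input_ohm,
                                 double cap_farad, int bits) {
  return (source_ohm + input_ohm) * cap_farad * (bits + 1) * kLn2;
}
```

For the MCP3008 figures (1 kΩ, 20 pF, 10 bits) this gives about 0.15 µs with an ideal source and about 1.7 µs with a 10 kΩ source, to be compared with the sampling window available at the chosen bus clock.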
I printed caps with inner diameter of
A second attempt to print a batch of 12 caps with inner diameter of
- When printed in batch the seams are not very clean. Stop using batches, or ensure Cura prints them one at a time, which requires the part to be smaller than the gantry height ($60mm$) and the number of extruders to be set to 1 in the printer's settings. I should be able to print 9 caps per batch.
- This time the fit with the switch is too loose. This is confirmed by measurement with a caliper: the outer diameter is $5.6mm$, which is $.1mm$ larger than the previous print with the same inner diameter. I should reduce the inner diameter to $4.1mm$.
- Assuming I stick to the .4 nozzle, I should try a thicker sleeve by increasing the outer diameter from $5.6mm$ to $4.1 + .4 * 2 * 2 = 5.7mm$.
- The sleeve is slightly too short and should be extended by $.1mm$.
- It is inconvenient to remove the brims, and I should switch to skirts.
- The triangular infill looks like a radioactive symbol «☢»; Line, Concentric or Gyroid should be preferred.
- Once assembled, the caps rise to $11mm$, versus $9mm$ to $10mm$ for the original instrument.
The new button has a long sleeve of
- Caps:
  - The new caps are too tight. Try an inner diameter of $4.15mm$.
  - They are also too tall. Set the total height to $11mm$.
- Joystick:
  - It is well adjusted but off-centered by 1mm South-East.
  - It is 1cm too far from the handle.
- Key holes:
  - The switches are still moving because the case is a bit flexible. I need to add a stiffener web to the top.
  - When applying pressure, holes above $⌀7.5$ are large enough to avoid contact. Every hole below $⌀7$ is uncomfortable. $⌀8$ should be a good compromise.
I considered using a resistive screen (Datasheet) to send MIDI control signals. This works well but takes too much space, and the integration with the 3D-printed case would require too much work. I decided to remove it.
- Article source
Pins are X- Y+ X+ Y- https://www.adafruit.com/product/3575
You can view the KiCad board on the KiCanvas online viewer.
MCP3008-I/SL JLCPCB
The enclosure design can be seen here.
I printed the model with PLA on Prusa mk4, and inserted M3 bolts.
I also printed the PCB model to validate the component footprints and the assembly:
Assembly is done with 23 x M3 screws with flat heads, and brass bolts:
I added protection to the two jack inputs, the potentiometer and the USB voltage input. The TVS diodes uni / bidirectional naming was confusing until I read this document.
I selected SRV05-4 packages.
Expression pedal wiring is not standardized. I measured the resistance between the tip, ring and sleeve on the inexpensive M-Audio EX-P. This pedal has a configuration switch and a potentiometer.
- Switch mode set to "M-AUDIO":
  - Potentiometer set to min: R<->S [11.1kΩ, 11.1kΩ], T<->S: [0.9kΩ, 12.1kΩ], R<->T: [12.1kΩ, 0.9kΩ]
  - Potentiometer set to max: R<->S [63.3kΩ, 63.3kΩ], T<->S: [52.14kΩ, 62.8kΩ], R<->T: [12.1kΩ, 0.9kΩ]
- Switch set to "Other":
  - Potentiometer set to min: T<->S [11.1kΩ, 11.1kΩ], R<->S: [0.9kΩ, 12.1kΩ], R<->T: [12.1kΩ, 0.9kΩ]
  - Potentiometer set to max: T<->S [63.3kΩ, 63.3kΩ], R<->S: [52.14kΩ, 62.8kΩ], R<->T: [12.1kΩ, 0.9kΩ]
The behavior between the tip and the ring is not affected by the potentiometer and the switch, which is convenient to avoid misuse.
I ordered 2 assembled PCBs from JLCPCB. Exporting the production files was reasonably easy using the tutorials. I had to add the LCSC Part numbers to the schematic and to export the BOM with custom scripts.
For the pick and place file, JLCPCB assumes that 0 degrees is the position of the part on a reel tape with the punctures on the left. This was not the case for my footprints. I wrote a script to postprocess the component orientations.
- Use a PCB-mounted potentiometer: simpler to assemble and to test.
- Use PCB-mounted tactile buttons for the two unused pins
- Have the sensors and ADC run with 5V for better sensitivity of the Hall sensors.
- Inside a bandoneon video 1, 2
- Bandoneón MIDI video, site
- bandominedoni video, github
- Bandonberry github
- JS Application to learn the bandoneon layout https://github.com/nicokaiser/bandoneon
- Hall effect keyboard listing
- Lekker keyboard [design note](https:// .io/post/validation-tests-lekker-update-8) and teardown video.