Reverse-Engineering Wireless Kinetic Switches

Posted on January 17, 2023 by greg

The other day I found some wireless light switches on Amazon. While wireless (light) switches are not exactly novel, the product description said the following:

The reasons for choosing this radio switch set. No power cable to lay when you install a wireless light switch, makes your house cleaner. No power supply, no batteries required for the radio switch, therefore safer for children and pets.

This was a little surprising: they don’t need batteries? What type of sorcery is that? The basic idea behind the electronics side has already been analyzed, and it’s quite interesting. But I was rather thinking about the radio transmission itself.

I figured such switches would be a perfect fit for my home automation system. So ideally, I’d like to have a custom receiver for them, which I could then integrate with the rest of my system. However, I couldn’t find very much info on the wireless communication protocol they used. The idea of doing some radio protocol reverse-engineering sounded quite fun, so I thought “oh well.. how hard can it be?” and decided to just buy a couple to play around.

While waiting for the hardware to arrive, I read up a little on common 433MHz protocols, and how to reverse them. There are quite a number of articles online which describe the overall process. As a very first step, I wanted to set up something that would allow me to capture the radio signals. This directly led to the question of how the signals are modulated.

Modulation

There are various ways to convert between digital data and analog signals. When it comes to radio waves, common schemes include amplitude-shift keying (ASK), frequency-shift keying (FSK), phase-shift keying (PSK) and various combinations of those. This is what’s called modulation.

The idea behind ASK is quite simple: you take a carrier wave (433MHz in our case), and whenever you want to send a one, you emit this wave with a high amplitude. When you want to send a zero, you pick a low amplitude (or vice versa). If the low amplitude is simply zero, i.e. the carrier is switched off, this is called on-off keying (OOK), as you basically just send bits by turning a radio wave on and off. ASK can also be extended so that multiple bits are sent at a time. For instance, to send two bits at a time, one could alternate between four different amplitudes in the signal.

The FSK approach in turn does not use the wave’s amplitude, but the wave’s frequency to encode data. Here, one would use waves of two different frequencies to encode zero and one bits. For instance, a wave with a higher frequency could indicate a one, whereas a lower frequency could indicate a zero. This simple binary FSK can - just as ASK - easily be extended into a scheme that would use four different frequencies to transport more than one bit at a time.

Please keep in mind that this write-up is by no means intended to serve as an in-depth explanation of radio modulation schemes; so please make sure to check other sources on this as well if you’d like to obtain an actual understanding of what’s going on.

Having looked at how ASK and FSK generally work, there was one immediate observation: regardless of the modulation scheme used, one needs some additional information in order to correctly decode the respective signal. For ASK we’d have to know the carrier frequency (which in our case should be somewhere around 433MHz), the different amplitude levels (which should be easy to see if we could somehow capture the radio waves) and the length of each bit. We need to keep in mind that we don’t have a clock line between the sender and receiver, and hence it is not immediately clear where one bit ends and another bit starts. For instance, if the sender sends a high amplitude for 100µs, shall that mean a single one bit? Two? One hundred?

Similarly, for FSK, we need to know the different frequencies that are being used as well as the bit length as outlined above.
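
To make the timing problem a bit more concrete, here is a minimal sketch in plain C++ (not tied to any radio library) of how a receiver that already knows the bit length could turn one run of constant level into bits. The function name runToBits and the round-to-nearest strategy are assumptions made for illustration only.

```cpp
#include <cmath>
#include <cstddef>
#include <string>

// Expand one run of constant level into the bits it represents, assuming
// the bit length is already known. Rounding to the nearest whole bit is an
// illustrative choice that tolerates a little timing jitter.
std::string runToBits(bool level, double runUs, double bitUs) {
    if (bitUs <= 0.0 || runUs <= 0.0) {
        return "";  // no meaningful bit length or run: nothing to decode
    }
    long count = std::lround(runUs / bitUs);
    return std::string(static_cast<std::size_t>(count), level ? '1' : '0');
}
```

With a bit length of 10µs, a 100µs high run decodes to ten one bits; with a bit length of 100µs, the very same run is a single one bit. Without knowing the bit length, both readings are equally plausible.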

First Attempt

So, how to figure out the used modulation scheme and the required parameters? I could have just bought a software-defined radio, but I thought that might be overkill. Instead, I just assumed that they would probably use something simple. Maybe ASK? That would be pretty easy to demodulate and once I’d be able to see the signal I could sure figure out the right parameters.

So I went ahead, bought some cheap 433MHz ASK receiver modules and connected them to my sound card using a simple voltage divider circuit. The basic idea behind this is to use the sound card’s ADC to sample the radio receiver’s 5V output signal. This approach is also well-known and documented on various websites.

I wired up everything and was excited to see that I was in fact able to capture the signal using a simple audio recorder. I felt I was one step closer to understanding the protocol! My excitement lasted for about 10 seconds, until I realized that while I did capture something, the result did in fact not look very useful.

What I basically obtained was just a simple square wave pattern consisting of some high-amplitude regions. I highly doubted that this pattern could contain enough information to distinguish between different switches. I’d rather have expected to see way more one and zero bits, under the assumption that they’d have to send at least some sort of ID for each switch and maybe also a CRC or some sort of redundancy for error detection.

So maybe it wasn’t ASK after all? Or maybe my sampling rate was way too low so that I just wouldn’t see the difference between one and zero bits? Or was it something completely different? The problem with the approach so far was that I just assumed they’d use ASK, and the test setup would simply not produce any meaningful results if another modulation scheme was used. Also, the setup was limited by my hardware choice; simply buying a cheap generic 433MHz receiver and hoping that it would pick up the signal was maybe a bit too optimistic.

SDR

So I decided to take a step back and to reduce the amount of guesswork by following a more structured approach. If I could somehow directly capture the radio waves in the frequency range used by the switch, I might be able to see a little more. So I decided to try using a software-defined radio. I did a little research and found that an RTL-SDR would maybe be good for getting started:

RTL-SDR is a very cheap ~$30 USB dongle that can be used as a computer based radio scanner for receiving live radio signals in your area (no internet required). Depending on the particular model it could receive frequencies from 500 kHz up to 1.75 GHz. Most software for the RTL-SDR is also community developed, and provided free of charge. Note that RTL-SDRs cannot transmit.

That sounded great for my purposes. So while I waited for the hardware to arrive, I looked for some software that I could use. Someone pointed me to Universal Radio Hacker, which looked quite simple and beginner-friendly. So I decided to give it a try.

Second Attempt

After the hardware arrived, I set up everything, installed the required drivers, URH and also set up Spektrum. The idea was to first use Spektrum in order to check the actual frequency that the light switch used. If there were multiple frequencies (e.g., for FSK), then I’d be directly able to see this as well.

So I tuned Spektrum to a range from 432MHz to 434MHz and lo and behold! I actually obtained a meaningful result!

One can observe at least three different frequencies being used here. The one at the center is at 433.3MHz, and then there are two more at about 433.3MHz + 50kHz and 433.3MHz - 50kHz, as depicted below.

This looked suspiciously like FSK. So I went ahead and fired up URH in order to see if I could actually capture a meaningful signal. I tuned to 433.3MHz plus/minus 100kHz and started capturing. Indeed, the result was already a lot better and looked like the one below.

This gave me some confidence that I was on the right track. A sensible next step was to figure out the exact modulation parameters. Luckily, URH offered to auto-detect them. Indeed it recognized the FSK modulation and gave me the following decoded signal.

One can also observe that there seem to be two transmissions; one when pressing the switch and one when releasing it. Zooming in on the first transmission yielded the following result:

There seem to be four individual parts, the last of which has a much lower amplitude than the others. This reminded me of the square wave pattern I got when trying to capture the signal using the sound card. Maybe the switch just repeatedly sends its data until it runs out of electric energy? That is at least a hypothesis we could easily check, so let’s keep it in mind for now. Further zooming in resulted in this wave pattern:

Just by looking at the waveform of the signal, the bit length turned out to be about 10µs. In other words, that would be about 100000 bits per second. This is also what URH guessed initially. Rounding this down to a more common number like 96000 bits per second (ten times the classic 9600), however, seems like a sensible idea.

What we have observed so far:

The radio transmission happens using FSK, with a center frequency of 433.3MHz and a deviation of plus/minus 50kHz.
The transmission likely happens at about 96kbit/s.
The switch performs two transmissions, one when pressing it and one when releasing it.
Each of these transmissions appears to contain multiple parts; possibly the switch just repeats its radio transmissions until it runs out of energy.
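
The numbers in this summary can be tied together with a little arithmetic. The following is a small sketch in plain C++; the struct and function names are made up for illustration, and which of the two tones stands for a one bit is an assumption, since the observations so far do not settle the polarity.

```cpp
#include <cmath>

struct FskParams {
    double centerMHz;     // center frequency
    double deviationKHz;  // frequency deviation
    double bitRate;       // bits per second
};

// Frequency (in MHz) transmitted for a given bit. Mapping a one to the
// upper tone is an assumption; the polarity could just as well be inverted.
double toneMHz(const FskParams& p, bool bit) {
    double offsetMHz = p.deviationKHz / 1000.0;
    return bit ? p.centerMHz + offsetMHz : p.centerMHz - offsetMHz;
}

// Duration of a single bit in microseconds.
double bitLengthUs(const FskParams& p) {
    return 1e6 / p.bitRate;
}
```

For 433.3MHz and 50kHz this gives tones at 433.25MHz and 433.35MHz. At 96000 bits per second a bit lasts about 10.4µs, at 100000 bits per second exactly 10µs; both are compatible with what can be read off the waveform by eye.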

This made a lot more sense than my previous attempt. So I decided to now move on to actually trying to figure out what the bit pattern I saw could really mean.

Initial Reversing

Switching the “Signal View” option in URH to “Demodulated” showed how URH did the FSK demodulation. Maybe a good first step would be to focus on the first supposed repetition of the “press switch” transmission.

Below the signal, the recognized bit pattern is shown. It reads like this:

1111110101010101010101010101010101010101010100001000011010010000100011000001
10010010110000010000110010000111100100111100101011111110

But what is that supposed to mean?

Network protocols often entail a bit more than only sending payload bits from one end to the other. To be able to receive a message, the receiver first has to know that something is being sent. This might sound more trivial than it actually is, particularly considering the fact that wireless communication channels typically carry quite an amount of random noise. How can a receiver distinguish between a signal (maybe a low-quality signal) and just random noise? One typical way to achieve this is to make use of so-called preambles and/or sync words. The idea behind this is that before sending actual data, a known pattern is sent, which can be recognized by the receiver.

Looking at the bit pattern that was picked up by URH, there is indeed a repeating sequence of one and zero bits right at the beginning of a transmission. This makes sense, as it can for instance be used by the receiver to align its internal timing with the sender. The leading one bits are likely artifacts of the sender powering up its radio. So we’ll just ignore these.

For actually decoding the data we would first strip away the supposed preamble of alternating zero and one bits. But how to interpret the rest of the data? It was now time to investigate the observed bit patterns a bit more thoroughly.
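
Stripping the leading one bits and the alternating preamble from a bit string copied out of URH can be sketched as follows (plain C++; stripPreamble is a name chosen for this example). Note the caveat in the comment: from a single capture it is impossible to tell whether a "10" at the boundary still belongs to the preamble or already to the data.

```cpp
#include <cstddef>
#include <string>

// Drop the leading run of one bits (presumably power-up artifacts) and the
// alternating 1/0 preamble from a string of '0'/'1' characters.
// Caveat: if the data after the preamble itself starts with "10", those
// bits are swallowed as well.
std::string stripPreamble(const std::string& bits) {
    std::size_t firstZero = bits.find('0');
    if (firstZero == std::string::npos) {
        return "";  // nothing but one bits: no data found
    }
    // The preamble starts at the last '1' before the first '0'.
    std::size_t pos = (firstZero == 0) ? 0 : firstZero - 1;
    while (pos + 1 < bits.size() && bits[pos] == '1' && bits[pos + 1] == '0') {
        pos += 2;
    }
    return bits.substr(pos);
}
```

Whatever remains after this step is the part worth comparing between captures.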

So the two transmissions for press/release could sure be interesting. Maybe there is a bit in the transmitted data which indicates whether the switch was pressed or released? Also, as I had bought two switches, I figured it would probably be helpful to compare the data sent by the different switches as well. This would maybe help recover the structure of the transmitted data.

Performing some more captures yielded the following bit patterns (the first couple of one bits have been stripped already):

Pressing switch one:  10101010101010101010101010101010101010100001000011010010000100011000001
                      10010010110000010000110010000111100100111100101011111110
Releasing switch one: 10101010101010101010101010101010101010100001000011010010000100011000001
                      100100101111000000101010111101011001001111001010111110
Pressing switch two:  10101010101010101010101010101010101010100001000011010010000100011001010
                      1010000001000001000011100000100110010011110010101111111
Releasing switch two: 10101010101010101010101010101010101010100001000011010010000100011001010
                      1010000001110000001010000111101110010011110010101111111

Again, let’s summarize what we learned:

Each “packet” starts with a preamble of alternating one and zero bits.
After the preamble, all packets seem to contain some more bits of static data. At least for the rather small sample set of two switches.
Pressing/releasing a switch indeed yields different signals.

Building a Custom Receiver

After having repeated the above analysis steps a couple of times, I eventually realized that it might not be the worst idea to start working on a custom receiver. This would on the one hand allow me to check whether the results so far can be reproduced on other hardware (and not only on my SDR). But also I got a bit tired of repeatedly analyzing captures in URH and I figured maybe starting to build a receiver would be a good change.

At this point, it was pretty clear that I wanted to have something that was able to demodulate FSK signals, and that the cheap receivers I bought previously were of no use. So this time I decided to be a bit more careful. I initially wanted to just buy a CC1101, which I already had a bit of experience with. The idea was to just connect it to an ESP8266 I still had somewhere in my stash, and to implement the packet decoding logic I had figured out so far on the microcontroller. However, I wasn’t able to find a CC1101 board usable for 433MHz (I did find a couple of 868MHz ones, though), and so I decided to just buy a HopeRF RFM69HW. I wired it up to my ESP, trying my best not to set the lab on fire while soldering it.

In order to get a working prototype without having to fiddle with the low-level details, I decided to just use the Arduino IDE together with the RadioLib library, which already contains support for the radio chip. So I hacked together a quick PoC, based on the available examples:

void setupRadio() {
  // In URH, we see that one symbol is about 10us; in other words, this gives us ~100k symbols per second.
  // We _might_ conjecture that it's rather 96k (i.e., 9.6k * 10).
  // From the spectrum analyzer, we can see the possible FSK frequencies involved;
  // the center is clearly at 433.3MHz, and there are peaks at +-50kHz.
  // So let's give this a try.

  // carrier frequency:                   433.3 MHz
  // bit rate:                            96kbps
  // frequency deviation:                 50kHz
  // Rx bandwidth:                        100kHz
  // output power:                        13 dBm (not that it matters here)
  // preamble length:                     16 bits
  int state = radio1.begin(433.3, 96, 50, 100.0, 13, 16);
  radio1.setCrcFiltering(false);
  radio1.fixedPacketLengthMode(16);
  uint8_t syncWord[] = {0x00};

  if (radio1.setSyncWord(syncWord, 0) == RADIOLIB_ERR_INVALID_SYNC_WORD) {
    Serial.println(F("[RF69] Selected sync word is invalid for this module!"));
    while (true);
  }

  if (state == RADIOLIB_ERR_NONE) {
    Serial.println(F("success!"));
  } else {
    Serial.print(F("failed, code "));
    Serial.println(state);
    while (true);
  }
  radio1.setDio0Action(handleRfPacket);
  radio1.startReceive();
}

The code above first initializes the radio using the parameters obtained from the spectrum analyzer and from URH. It then uses the radio’s interrupt-based receiving functionality, so that whenever the radio sees a packet, it raises an interrupt that triggers the handleRfPacket function.

Some parameters, like the preamble length or the total packet length, have been set to rather arbitrary values (16 bits preamble length and 16 bytes total packet length). While these values could have been obtained from the URH analysis results, I figured that I’d probably have to fine-tune them anyway and so I just used something that was within the expected range. Regarding the packet length, it should be noted that RadioLib offers some higher-level logic for receiving dynamically-sized packets containing a length field. The packets here most likely do not contain such a field, hence the fixedPacketLengthMode() call. Also, CRC checking has been disabled because I didn’t really know if there was a CRC involved.

Lastly, a word about the sync word. After the preamble, (radio) packets can contain a hard-coded sync word. This value can be used to tell apart actual packets from random noise on the channel. The packet captures in URH suggested that there might be a sync word in place, as all packets seemed to start with the same bit sequence. However, I just wanted to see some packet captures first, and then gradually adjust all the parameters. So I decided to just set a zero-length sync word for now.

The first implementation of handleRfPacket simply printed out the packet data obtained via radio1.readData() on the serial console. This approach “worked” in the sense that it did print something on the serial console. But the output was a total mess. This was not totally unexpected, because - as described above - quite some parameters were probably not correct. The radio picked up quite an amount of random noise, so the next step was to try isolating actual packets from background noise.

One obvious way for doing so would be setting a sync word. However, this is a little less trivial than one might think. Where does the preamble actually end, and where does the sync word start? Also, how long is the sync word?

More Protocol Analysis

As already outlined, there are multiple options for determining possible sync words. As a first step, I figured I’d maybe just take a guess and see where it leads to.

Maybe one could just compare the “switch pressed” bit patterns of the first and the second switch, and split the bit patterns where they start to differ:

Pressing switch one:  1010101010101010101010101010101010101010000100001101001000010001100  0001100100101...
Pressing switch two:  1010101010101010101010101010101010101010000100001101001000010001100  1010101000000...
                      |______________________________________||_________________________|
                                     preamble?                         sync word?
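
Finding the position where two captures start to differ does not have to be done by eye. A minimal sketch in plain C++ (the function name is made up for this example):

```cpp
#include <algorithm>
#include <cstddef>
#include <string>

// Number of leading bits two captures have in common, i.e. the index of
// the first position at which they differ.
std::size_t commonPrefixLength(const std::string& a, const std::string& b) {
    std::size_t n = std::min(a.size(), b.size());
    std::size_t i = 0;
    while (i < n && a[i] == b[i]) {
        ++i;
    }
    return i;
}
```

Applied to the bits that follow the 40-bit preamble in the two "pressing" captures, this returns 27, which is the length of the sync word candidate marked above.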

So, assuming a preamble of 1010101010101010101010101010101010101010 this would give a possible candidate for a sync word, namely 000100001101001000010001100. So the preamble would be 40 bits, and the sync word 27 bits. Now, 27 bits isn’t really a nice number. Maybe it’s rather 24 bits? In hex, these 24 bits would be: 0x10 0xd2 0x11. The actual payload size would then be about 60 bits or so. I opted to just go with 7 bytes of packet size. So I adjusted the receiver code as shown below.

  radio1.fixedPacketLengthMode(7);
  uint8_t syncWord[] = {0x10, 0xd2, 0x11};

  if (radio1.setSyncWord(syncWord, sizeof(syncWord)) == RADIOLIB_ERR_INVALID_SYNC_WORD) {
    Serial.println(F("[RF69] Selected sync word is invalid for this module!"));
    while (true);
  }

And wow! It actually seemed to work! I didn’t receive any background noise anymore, and whenever I pressed/released one of the switches, I actually received multiple repetitions of the payload data (albeit sometimes containing some bit errors).

But was that actually correct? The bit errors were a bit troubling: how would an actual receiver deal with them? Sure, there are multiple repetitions of each packet, and so one could in principle do some error correction based on that. But somehow I doubted that this was what’s actually going on. Also, the bit patterns for pressing/releasing the same switch show quite a number of differences, but one could naively assume that they should only differ in one single bit. So where do these differences really come from?

I decided to dig a bit deeper. How would one determine whether the received data is actually correct? If only I knew what an actual packet should look like, so that I could compare that to the results so far…

But what should an actual packet look like? We already observed that the packets for different switches differ. This makes total sense, because the actual receiver needs to be able to tell different switches apart. So most likely, each packet would contain some sort of identifier for the respective switch. Could we somehow find out what such an identifier should look like? I decided to inspect the switches a bit more closely, and found the following.

Each switch had a sticker with a barcode and some text:

First switch: ES2K1202112160136 064B0F0002
Second switch: ES2K1202112170724 2A810F0002

The second part of each sequence suspiciously looks like a hex number. Maybe one could find at least parts of that number in the transmitted packets? The hex numbers seem to differ only in their first two bytes, so I decided I’d maybe try to see if these bytes occur somewhere in the radio transmissions. The value 064B would be 0000011001001011 in binary, and the value 2A81 would be 0010101010000001. And indeed! We can find these bit patterns in the captured radio transmissions:

                                                     064B: 0000011001001011
Pressing switch one:  preamble + 000010000110100100001000110000011001001011000001000011001...
                                                     2A81: 0010101010000001
Pressing switch two:  preamble + 000010000110100100001000110010101010000001000001000011100...
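
Searching a capture for a value from the sticker can be sketched like this (plain C++; toBits16 and findValue16 are names invented for this example). Keep in mind that a 16-bit pattern can also show up by coincidence, so a hit is a hint rather than proof.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Render a 16-bit value as a string of '0'/'1', most significant bit first.
std::string toBits16(std::uint16_t value) {
    std::string out;
    for (int bit = 15; bit >= 0; --bit) {
        out += ((value >> bit) & 1u) ? '1' : '0';
    }
    return out;
}

// Offset of the first occurrence of the 16-bit value in a captured bit
// string, or -1 if it does not occur at all.
long findValue16(const std::string& bits, std::uint16_t value) {
    std::size_t pos = bits.find(toBits16(value));
    return pos == std::string::npos ? -1 : static_cast<long>(pos);
}
```

The returned offset is useful beyond a yes/no answer: it tells us exactly where the ID starts, and therefore which bits sit directly in front of it.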

Ha! So now we can re-evaluate our assumptions regarding the preamble and the sync word! Maybe the actual packets rather look like this:

Pressing switch one:  10101010101010101010101010101010101010100001000011010010000100011 0000011001001011 ...
Pressing switch two:  10101010101010101010101010101010101010100001000011010010000100011 0010101010000001 ...
                      |_______________________________________||______________________| |______________|
                            preamble plus "pause"                 24 bits sync word        switch ID      ?

The assumption behind this is that the sync word should immediately precede the switch ID, and that it should have a bit length divisible by eight. A quick check reveals that the receiver is still able to capture packets after adjusting the sync word to 0x21 0xa4 0x23. And now it even prints the same switch ID that is shown on the backside of each switch. At least if there are no bit errors…
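
The re-aligned sync word can be derived mechanically by taking the 24 bits directly in front of the switch ID and packing them into bytes. A sketch in plain C++, with helper names that are assumptions of this example:

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Pack a string of '0'/'1' characters into bytes, most significant bit
// first. Trailing bits that do not fill a whole byte are ignored.
std::vector<std::uint8_t> packBits(const std::string& bits) {
    std::vector<std::uint8_t> bytes;
    for (std::size_t i = 0; i + 8 <= bits.size(); i += 8) {
        std::uint8_t value = 0;
        for (std::size_t j = 0; j < 8; ++j) {
            value = static_cast<std::uint8_t>((value << 1) | (bits[i + j] == '1' ? 1 : 0));
        }
        bytes.push_back(value);
    }
    return bytes;
}

// The 24 bits immediately preceding the ID that starts at bit idOffset.
// Returns an empty vector if fewer than 24 bits precede the ID.
std::vector<std::uint8_t> syncWordBefore(const std::string& bits, std::size_t idOffset) {
    if (idOffset < 24 || idOffset > bits.size()) {
        return {};
    }
    return packBits(bits.substr(idOffset - 24, 24));
}
```

With the ID of switch one starting 25 bits after the preamble, this yields 0x21 0xa4 0x23. Packing the first 24 bits after the preamble instead gives the earlier guess of 0x10 0xd2 0x11; the two candidates are the same bit sequence shifted by one position.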

In order to dig yet deeper, let’s maybe inspect the difference between pressing and releasing a switch again:

Pressing switch one:  preamble + sync + 0000011001001011 00000100 00110010000111100100111100101011111110
Releasing switch one: preamble + sync + 0000011001001011 11000000 101010111101011001001111001010111110
Pressing switch two:  preamble + sync + 0010101010000001 00000100 0011100000100110010011110010101111111
Releasing switch two: preamble + sync + 0010101010000001 11000000 1010000111101110010011110010101111111
                                        |______________| |______|
                                            switch ID      p/r?             ?

Now that we likely identified the right packet structure, we can also observe that the byte immediately following the switch ID in the packet might indicate whether the switch was pressed or released. Why only one byte? The reasoning behind this assumption is that this byte seems to have the value 0x04 for pressing, and 0xc0 for releasing the switch. And these values are the same for both switches. The following bytes, however, seem to differ. So what are they used for?
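
Under these assumptions, the first three bytes after the sync word could be interpreted as sketched below (plain C++; the type and function names are made up). The mapping of 0x04 and 0xc0 is only what the captures of two switches suggest, so any other value is reported as unknown rather than guessed at.

```cpp
#include <cstdint>

enum class SwitchEvent { Pressed, Released, Unknown };

struct SwitchPacket {
    std::uint16_t switchId;  // e.g. 0x064B, as printed on the sticker
    SwitchEvent event;
};

// Interpret the first three bytes following the sync word:
// two bytes of switch ID, then one byte for pressed/released.
SwitchPacket decodePacket(const std::uint8_t* payload) {
    SwitchPacket packet;
    packet.switchId = static_cast<std::uint16_t>((payload[0] << 8) | payload[1]);
    switch (payload[2]) {
        case 0x04: packet.event = SwitchEvent::Pressed;  break;
        case 0xc0: packet.event = SwitchEvent::Released; break;
        default:   packet.event = SwitchEvent::Unknown;  break;
    }
    return packet;
}
```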

It seems like all required information has already been identified: the packet starts with a preamble, followed by a sync word. Then comes the switch ID, and then a byte indicating whether the switch was pressed or released. So what could the other bytes be good for? One possibility is that there might be some sort of checksum in place. This would make sense, but how could we find out if that’s really the case?

Identifying an unknown checksum based only on a small sample set of data is not really promising. But then again, maybe the checksum was based on a well-known scheme? That would make things a bit easier. Of course there are many possible ways to compute checksums. And again, maybe using some trial and error wouldn’t hurt. An obvious choice for checksums is one of the countless CRC schemes. A good first step might be to figure out the size of the used checksum:

Pressing switch one:  preamble + sync + 0x06 0x4b + 0x04 + 0011001000011110 0100111100101011111110
Releasing switch one: preamble + sync + 0x06 0x4b + 0xc0 + 1010101111010110 01001111001010111110
Pressing switch two:  preamble + sync + 0x2a 0x81 + 0x04 + 0011100000100110 010011110010101111111
Releasing switch two: preamble + sync + 0x2a 0x81 + 0xc0 + 1010000111101110 010011110010101111111
                                        |_______|   |__|   |______________|
                                           ID        p/r       checksum?          ?

Closely inspecting the bits that come after the pressed/released byte shows something interesting: only the first 13 bits are actually changing. The following bits seem to be rather constant (except for maybe some errors at the end). But 13 bits is of course not really a round number. So could the checksum maybe be 16 bits in size?
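
Which bit positions change across captures can also be computed instead of eyeballed. The sketch below (plain C++, function name invented here) XORs every sample with the first one and ORs the results; the input values are the four 16-bit fields from the table above, transcribed to hex.

```cpp
#include <cstddef>
#include <cstdint>

// OR together the differences between every sample and the first one. A set
// bit in the result marks a position that changed in at least one sample.
std::uint16_t varyingBits(const std::uint16_t* samples, std::size_t count) {
    std::uint16_t mask = 0;
    for (std::size_t i = 1; i < count; ++i) {
        mask |= static_cast<std::uint16_t>(samples[i] ^ samples[0]);
    }
    return mask;
}
```

For the values 0x321e, 0xabd6, 0x3826 and 0xa1ee the resulting mask is 0x9bf8: the lowest three bits never change, everything that does change lies within the first 13 bits. With only four samples, three constant trailing bits may well be a coincidence, so a field width of 16 bits remains a reasonable guess.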

Assuming the checksum is a CRC-16 variant, one can easily check on crccalc.com if the used CRC parameters correspond to a well-known scheme. So, entering 2a 81 04 as data and hitting the “CRC-16” button should produce a result of 0x38 0x26. And indeed it does! The “CRC-16/AUG-CCITT” scheme produces this result. Confirming with the other entries in the above table directly shows that indeed, CRC-16/AUG-CCITT is likely used here.
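
For reference, here is a compact bitwise sketch of that CRC in plain C++. The parameters (polynomial 0x1021, initial value 0x1D0F, no bit reflection, no final XOR) are the ones catalogued for CRC-16/AUG-CCITT; the function name crc16Aug is made up for this example and is not necessarily what the receiver code further down uses.

```cpp
#include <cstddef>
#include <cstdint>

// Bitwise CRC-16 with polynomial 0x1021, initial value 0x1D0F, no bit
// reflection and no final XOR (the CRC-16/AUG-CCITT parameter set).
std::uint16_t crc16Aug(const std::uint8_t* data, std::size_t length) {
    std::uint16_t crc = 0x1D0F;
    for (std::size_t i = 0; i < length; ++i) {
        crc ^= static_cast<std::uint16_t>(data[i] << 8);
        for (int bit = 0; bit < 8; ++bit) {
            if (crc & 0x8000) {
                crc = static_cast<std::uint16_t>((crc << 1) ^ 0x1021);
            } else {
                crc = static_cast<std::uint16_t>(crc << 1);
            }
        }
    }
    return crc;
}
```

Feeding it 2a 81 04 gives 0x3826, and 06 4b 04 gives 0x321e, matching the captured bits. Because there is no final XOR and the CRC is transmitted high byte first, running the same function over the three data bytes plus the two CRC bytes yields zero, which makes for a convenient validity check.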

About the remaining bits: maybe they are some sort of footer? This might not be too important, because given that we found the checksum scheme, we now have everything in place to build a custom receiver.

Some Code

So, after all the analysis described above, I was finally able to receive packets using the below Arduino code snippet.

#include <RadioLib.h>
#include <string>
#include "crc16.h"

#define RF_PACKET_SIZE 5

// RF69 has the following connections:
// CS pin:    D8
// DIO0 pin:  D1
// RESET pin: D3
RF69 radio1 = new Module(D8, D1, D3);

void setupRadio() {
  Serial.print(F("[RF69] Initializing ... "));
  // carrier frequency:                   433.3 MHz
  // bit rate:                            96kbps
  // frequency deviation:                 50kHz
  // Rx bandwidth:                        100kHz
  // output power:                        13 dBm
  // preamble length:                     32 bits
  int state = radio1.begin(433.3, 96, 50, 100.0, 13, 32);
  radio1.setCrcFiltering(false);
  radio1.fixedPacketLengthMode(RF_PACKET_SIZE);

  if (state == RADIOLIB_ERR_NONE) {
    Serial.println(F("success!"));
  } else {
    Serial.print(F("failed, code "));
    Serial.println(state);
    while (true);
  }

  uint8_t syncWord[] = {0x21, 0xa4, 0x23};

  if (radio1.setSyncWord(syncWord, sizeof(syncWord)) == RADIOLIB_ERR_INVALID_SYNC_WORD) {
    Serial.println(F("[RF69] Selected sync word is invalid for this module!"));
    while (true);
  }
  radio1.setPreambleLength(32);

  // Set an interrupt handler for receiving packets
  radio1.setDio0Action(handleRfPacket);
  // Start receiving (and handling packets)
  radio1.startReceive();
}

bool radioPacketValid(const uint8_t* buf) {
  // The first three bytes of the packet are subject to CRC-16/AUG-CCITT,
  // as determined by https://crccalc.com/. The CRC value is stored in
  // the two following bytes. A packet looks like this:
  //   2a 81 04     38 26     4f 2b
  //     DATA       CRC16     ?????
  // Therefore, if we CRC16 the first _five_ bytes, the expected result
  // is zero.
  return crc16(buf, 5) == 0;
}

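// This will be registered as an ISR for receiving packets from the
// 433 MHz radio. Note: reading via SPI and printing inside an ISR is kept
// here for brevity; RadioLib's own interrupt examples only set a flag in
// the ISR and do the actual work in loop().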
#if defined(ESP8266) || defined(ESP32)
  ICACHE_RAM_ATTR
#endif
void handleRfPacket() {
  uint8_t buf[RF_PACKET_SIZE];
  float rssi;

  int state = radio1.readData(buf, sizeof(buf));
  if (state == RADIOLIB_ERR_NONE) {
     rssi = radio1.getRSSI();
     if (rssi < -120) {
       return;
     }
  } else if (state == RADIOLIB_ERR_RX_TIMEOUT) {
    Serial.println("[RF69] Receive timeout!");
    return;
  } else {
    // some other error occurred
    Serial.printf("[RF69] Unknown error: %d\n", state);
    return;
  }

  if (!radioPacketValid(buf)) {
    return;
  }

  // Now we can do whatever we want with our packet
  // We just should make sure to call radio1.startReceive() again.
  ...
}

About Security…

It might be quite obvious, but after having reverse-engineered the protocol it seems like the data sent by the light switches can easily be captured/replayed by basically anyone who is within radio range. While this might not be a super critical issue given the use-case of the light switches, it is still good to know. Like, using such a switch to unlock your smart door lock or something is probably not the best idea ever.

Analyzing the protocol for security flaws was however not the primary goal here, and this result is also kind of expected.

Final Thoughts

This blog post described my journey of reverse-engineering a custom 433MHz radio protocol for a wireless switch. One lesson learned was that I should have followed a structured approach right from the beginning, which would have saved me quite some time. On the other hand, it’s mistakes we often learn from, so I guess that’s just fine.

Maybe it is already obvious, but the results described here are based on a bit of guesswork and the number of tested switches is rather small. So maybe some aspects about the packet structure are not entirely correct. However, given more testing, this should be easy to correct.

Of course there is way more to discover in this area, but analyzing the simple protocol used by my wireless light switches seemed like a good target to get started.

In case you have questions, comments or feedback, please feel free to get in touch!