Bit-flips are inadvertent changes of bits (0’s and 1’s) caused by hardware, software or external factors. Spacecraft use error-correction software and hardware redundancy to protect against bit-flips.
Do you remember the dreaded blue screen of death on your old Windows XP system? Nothing is more annoying than having your system go down inexplicably in the middle of a kill-streak in an online Counter-Strike match. Most of the time, however, rebooting is enough to bring your system back online.
However, what caused the system to crash on your PC? Computers are employed everywhere today, and in some critical areas, such as a spacecraft, having the system crash could have quite unpleasant consequences, especially if it happens inexplicably!
There are many phenomena that can cause systems to crash, one of which is a Bit-Flip. Spacecraft must be protected from them to operate safely. Let’s read more about Bit-Flips before reading about the protective measures employed by spacecraft to avoid calamities.
Bit-Flip: The Name Tells The Story
Digital data is stored as a sequence of 0s and 1s. Each 0 or 1 is called a bit. A bit-flip is an inadvertent change in the state of a bit. For example, if a bit accidentally changes from 0 to 1 or vice-versa, the event is called a bit-flip.
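To make this concrete, here is a minimal Python sketch of what a single bit-flip does to a stored value. The helper name `flip_bit` and the choice of the letter "A" as the stored byte are assumptions made purely for illustration; a real bit-flip happens in hardware, not through a function call.

```python
def flip_bit(value: int, position: int) -> int:
    """Return `value` with the bit at `position` inverted (0 = least significant bit)."""
    # XOR with a mask that has a single 1 at `position` inverts only that bit.
    return value ^ (1 << position)


original = 0b01000001               # the byte that encodes the letter "A" in ASCII
corrupted = flip_bit(original, 1)   # simulate an accidental flip of bit 1

print(f"{original:08b} -> {chr(original)}")    # 01000001 -> A
print(f"{corrupted:08b} -> {chr(corrupted)}")  # 01000011 -> C
```

A single flipped bit is enough to turn one letter into another, which is why even rare bit-flips matter.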
A common way of storing bits is in floating-gate transistors, where the presence or absence of voltage between two terminals indicates the state of the bit (1 or 0).
There are many types of bit-flips. To cause a bit-flip, there must be an accidental flow of electrons within the transistor, which changes the voltage between the source and the drain. This can be triggered by high temperature, hardware faults, and cosmic radiation, among others.
For spacecraft, the hardware and its thermal behavior are rigorously tested before deployment, which largely rules out internally caused bit-flips. This leaves cosmic bit-flips as the main potential troublemakers.
Again, as the name suggests, cosmic bit-flips are inadvertent changes of bits with causes of cosmic origin.
Cosmic rays are streams of subatomic particles moving at almost the speed of light. They are composed mainly of atomic nuclei, that is, atoms that have been stripped of their electrons. Hydrogen and helium are the most abundant elements in the universe, so cosmic rays consist primarily of protons (hydrogen atoms without their electron) and alpha particles (helium atoms without their electrons), both of which are positively charged. Since they are charged and move at extremely high speeds, they produce magnetic fields of their own, which interact with Earth’s magnetic field.
When cosmic rays reach Earth, some of them are deflected by the planet’s magnetic field. The rest, which manage to enter Earth’s atmosphere, strike air molecules and cause a cosmic ray cascade.
Cosmic Ray Cascade
This is a cascading effect where cosmic rays strike the upper atmosphere, lose energy, and decompose into other particles with lower energy, which strike air lower down in the atmosphere, creating even more elementary particles with lower energy. The end result is a cascade of particles (muons, alpha particles, protons, etc.) hitting the ground. When these particles strike transistors, they can cause bit-flips.
Interfering With Electronics
In modern electronic devices, Floating-Gate Metal Oxide Semiconductor (FGMOS) transistors are used to store data. A Floating Gate (FG) is an electrically isolated terminal that stores electrons. When electrons are present in the FG, they create a small electric field that affects the flow of electrons in the transistor. This effect can be measured, and the state of the bit (1 or 0) can be inferred from it. When charged particles strike an FGMOS transistor, they interact with the electrons in the FG and cause an unintended flow of electrons to or from it. The resulting loss or gain of stored electrons is what flips the bit.
Spacecraft Protection From Cosmic Bit-Flips
The first line of defense is hardware redundancy. If an input from a system has to be read, three physical subsystems provide the input. If two agree and the third doesn’t, then the third is ignored and the concordant value from the other two is taken. This is called 3-way redundancy. More redundancy may be employed, as in the Space Shuttle program, which used 5-way redundancy, where up to two faulty subsystems can be overridden.
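The voting logic described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names `majority_vote` and `bitwise_vote` are hypothetical, the readings are plain integers, and real flight systems implement voting in dedicated hardware or flight software that this sketch does not try to reproduce.

```python
from collections import Counter


def majority_vote(readings):
    """Return the value reported by a strict majority of redundant subsystems.

    Raises ValueError when no single value is reported by more than half of them.
    """
    value, count = Counter(readings).most_common(1)[0]
    if count * 2 <= len(readings):
        raise ValueError("no majority among readings")
    return value


def bitwise_vote(a: int, b: int, c: int) -> int:
    """3-way vote taken bit by bit: each output bit is the majority of the three input bits."""
    return (a & b) | (a & c) | (b & c)


# 3-way redundancy: one subsystem disagrees and is outvoted.
print(majority_vote([42, 42, 46]))        # 42

# 5-way redundancy: up to two faulty subsystems can be overridden.
print(majority_vote([7, 7, 3, 7, 9]))     # 7

# Bitwise voting still recovers 0b1010 when two copies each have a different bit flipped.
print(bin(bitwise_vote(0b1011, 0b1010, 0b1110)))  # 0b1010
```

Voting on whole values is the simplest scheme to reason about. Voting bit by bit tolerates flips in more than one copy, as long as no single bit position is corrupted in two copies at once.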
Error Correction Codes (ECCs)
Software is used to detect and correct errors. Data (a sequence of 0’s and 1’s) is divided up into blocks whose sizes are powers of 2. Redundancy bits, called parity bits, are added so that the number of 1’s in each block is always even or always odd (the sender decides which). For example, if the sender decides that the parity of 1’s should be odd, then the receiver counts the number of 1’s in the message, and if that number is not odd, an error is detected. A single parity bit can only detect an odd number of flipped bits, most commonly a single flip; it misses a 2-bit error and cannot tell which bit flipped. Adding more parity bits, each covering a different subset of the data (as in Hamming codes), makes it possible to detect more errors and to locate and correct a flipped bit.
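The parity idea can be sketched in Python as follows. The first two functions implement the single parity bit described above. The second pair is a Hamming(7,4) code, which is a swapped-in example of "adding more parity bits" and not something the article names as the scheme spacecraft actually use. All function names, the bit-list representation, and the use of even parity inside the Hamming code are assumptions made for illustration.

```python
def add_parity_bit(bits, odd=True):
    """Append one parity bit so that the total number of 1s is odd (or even)."""
    target = 1 if odd else 0
    parity = (sum(bits) % 2) ^ target
    return bits + [parity]


def parity_ok(word, odd=True):
    """Receiver-side check: does the count of 1s have the agreed parity?"""
    return sum(word) % 2 == (1 if odd else 0)


def hamming74_encode(d):
    """Encode 4 data bits into 7 bits using 3 parity bits (even parity)."""
    d1, d2, d3, d4 = d
    p1 = d1 ^ d2 ^ d4   # covers codeword positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4   # covers codeword positions 2, 3, 6, 7
    p4 = d2 ^ d3 ^ d4   # covers codeword positions 4, 5, 6, 7
    return [p1, p2, d1, p4, d2, d3, d4]


def hamming74_correct(c):
    """Return (data_bits, error_position); position 0 means no error was found."""
    c = list(c)
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]
    s4 = c[3] ^ c[4] ^ c[5] ^ c[6]
    position = s1 + 2 * s2 + 4 * s4   # the failed checks spell out the bad position
    if position:
        c[position - 1] ^= 1          # flip the faulty bit back
    return [c[2], c[4], c[5], c[6]], position


# Single parity bit: one flip is detected, two flips go unnoticed.
word = add_parity_bit([1, 0, 1, 1, 0, 0, 1])
one_flip = list(word); one_flip[2] ^= 1
two_flips = list(one_flip); two_flips[5] ^= 1
print(parity_ok(word), parity_ok(one_flip), parity_ok(two_flips))  # True False True

# Hamming(7,4): a single flip is located and corrected.
codeword = hamming74_encode([1, 0, 1, 1])
codeword[4] ^= 1                      # simulate a bit-flip at position 5
print(hamming74_correct(codeword))    # ([1, 0, 1, 1], 5)
```

The trade-off is overhead: the single parity bit costs one extra bit per block but can only raise an alarm, while the Hamming sketch spends three extra bits per four data bits in exchange for being able to repair any single flipped bit.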
A Quick Recap
Thus, we’ve seen that bit-flips are accidental changes in data that may be caused by a variety of factors. When the cause of the bit-flip is cosmic rays, then it’s called a cosmic bit-flip. Spacecraft must be protected from cosmic bit-flips, as such errors can jeopardize the mission objectives and crew safety.
The most commonly used remedy is to employ hardware redundancy and software Error Correction Codes. Hardware redundancy means that multiple subsystems feed the input for a single system, and only the input that matches the majority is accepted. Software ECCs employ algorithms to detect changes in individual bits by comparing the parity of the 1’s computed on the sender and receiver sides.