January 14, 2018
The previous post dealt with the LCD hardware and the specifics of the ICs. This post will focus on the driver code used to create the sequence of signals previously discussed. I’m using a TI dev board with the Cortex M4 core TM4C123GH6PM microcontroller.
Before diving into the details of getting pixels to actually show up in the proper locations on the screen, let’s walk through a high-level overview of how the driver is going to be set up.
The driver keeps two buffers, readBuffer and writeBuffer.
The higher level code loads sprites into writeBuffer. Upon loading the last sprite, the higher level code will let the driver know that the frame is complete.
The buffers are then switched, and readBuffer will be loaded into the LCD controller ICs and latched out to display on the screen.
The following diagram shows a brief system overview limited to the portions of the system directly involved in using the LCD screen. In this post I’m going to focus on how data gets from memory (the buffer) to being displayed on the LCD. Future posts will delve into detail on how the sprites are stored and how the buffers get updated.
The modules involved are:
The file main.c calls the appropriate initialization functions and then enters the infinite while loop. In the loop a check is performed to see if image data should be updated, that is, if a buffer is available to write to. If true, sprite states are updated and loaded into the current writeBuffer.
The most relevant parts of main.c are shown below.
int main(void){
DisableInterrupts();
PLL_Init();
IO_Init();
EnableInterrupts();
//Down the rabbit hole we go
while(1){
if(IO_Ready())
{
DisplayTests_DrawDiag();
DisplayTests_DrawBorder();
IO_UpdatesCompleted();
}
}
}
The IO module has public functions declared in IO.h. The relevant public functions are shown below. The first three are called by main.c. The last, IO_LoadSprite(), is called by DisplayTests.c.
void IO_Init(void);
bool IO_Ready(void);
void IO_UpdatesCompleted(void);
void IO_LoadSprite( const uint16_t xpos,
const uint16_t ypos,
const Sprite_t sprite );
The screen module has public functions which are only called by IO.c. No other module is aware of its existence. The header file, screen.h, contains the declarations of these public functions as well as the screen width and screen height in pixels.
#define SCREEN_W 480
#define SCREEN_H 64
void Screen_Init(void);
void Screen_SetBufferIsFull(void);
bool Screen_IsBufferAvailable(void);
void Screen_ClearBuffer(void);
uint8_t * Screen_GetBuffer(void);
Alright, now that the overview is out of the way, let’s get into the meat of turning a bunch of ones and zeros into pixels on a screen.
A few screen related constants, two arrays large enough to hold an entire screen of data, and two pointers to those arrays are initialized at the top of screen.c.
#define FRAME_REFRESH_HZ 11
#define LCD_REFRESH_HZ 44
#define BUFFER_SIZE (SCREEN_W*SCREEN_H/8)
uint8_t ScreenBufferA[BUFFER_SIZE] = {0};
uint8_t ScreenBufferB[BUFFER_SIZE] = {0};
uint8_t *writeBuffer = NULL;
uint8_t *readBuffer = NULL;
There are five data signals which need to be handled.
These will use pins 4-0 on port F. The TM4C line of uCs has a feature which allows bits [9:2] of the address used to access a GPIO data register to act as a mask, so any combination of pins can be read or written in a single bus access with no read-modify-write sequence. This is known as “bit-specific addressing.” It is easy to confuse with Cortex-M “bit banding,” which is a separate feature that gives individual bits their own word addresses. See the following sections of the datasheet for more info:
and this webpage (ctrl-f for ‘bit-specific addressing’).
The first three signals will use PF1, PF2 and PF3 (respectively).
#define DataClk_PF1 (*((volatile unsigned long *)0x40025008))
#define LatchClk_PF2 (*((volatile unsigned long *)0x40025010))
#define FrameClk_PF3 (*((volatile unsigned long *)0x40025020))
The two serial data signals will always be written together, so a single mask will suffice.
#define SerialData_PF04 (*((volatile unsigned long *)0x40025044))
As an example, the address for accessing PF0 and PF4 simultaneously is calculated via:
Address \( = \textrm{Base\_Address} + ((2^4 + 2^0) \times 4) \)
where Base_Address, the address of the port F data register on the APB (Advanced Peripheral Bus), is 0x4002.5000. (Multiplying by 4 is equivalent to bit shifting left by 2.)
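The same calculation can be written as a small C helper. This is only a sketch for checking the numbers on a host machine: gpio_masked_address() and check_post_addresses() are hypothetical names that are not part of the driver, and the base address is the one quoted above.

```c
#include <assert.h>
#include <stdint.h>

// Port F data register base on the APB, as quoted in the text.
#define GPIO_PORTF_DATA_BASE 0x40025000u

// Hypothetical helper: address that exposes only the pins selected in
// pin_mask (bit n = pin n). The mask lands in address bits [9:2],
// which is why it is shifted left by 2 (i.e. multiplied by 4).
static uint32_t gpio_masked_address(uint32_t base, uint32_t pin_mask)
{
    assert(pin_mask <= 0xFFu);  // a port has at most eight pins
    return base + (pin_mask << 2);
}

// Cross-check against the four addresses defined in this post.
static void check_post_addresses(void)
{
    assert(gpio_masked_address(GPIO_PORTF_DATA_BASE, 1u << 1) == 0x40025008u);
    assert(gpio_masked_address(GPIO_PORTF_DATA_BASE, 1u << 2) == 0x40025010u);
    assert(gpio_masked_address(GPIO_PORTF_DATA_BASE, 1u << 3) == 0x40025020u);
    assert(gpio_masked_address(GPIO_PORTF_DATA_BASE, (1u << 4) | (1u << 0)) == 0x40025044u);
}
```

Passing a mask of 0xFF gives the address that reads or writes all eight pins at once.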
Now writing is as simple as:
DataClk_PF1 = (1<<1); //Make high
DataClk_PF1 = 0; //Make low;
SerialData_PF04 = (1<<4)|(1<<0); //Make both high
SerialData_PF04 = 0; //Make both low
SerialData_PF04 = (0<<4)|(1<<0); //Make PF4 low and PF0 high
These are great because they can’t affect any pins other than those intended, are way faster than the standard read-modify-write sequence, and are atomic (they can’t be interrupted partway through).
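To make the "can't affect other pins" point concrete, here is a host-side model of what a masked write does to the port. It is a sketch only: masked_write() and masked_write_example() are hypothetical names, and the real work is done in hardware by the address decode rather than in software.

```c
#include <assert.h>
#include <stdint.h>

// Hypothetical host-side model of a masked GPIO data write: only the
// bits selected by pin_mask take on the new value, every other pin
// keeps whatever state it already had.
static uint8_t masked_write(uint8_t port_state, uint8_t pin_mask, uint8_t value)
{
    return (uint8_t)((port_state & (uint8_t)~pin_mask) | (value & pin_mask));
}

// Example: PF1..PF3 are high; drive PF0 high and PF4 low through the
// PF0/PF4 mask (0x11) and confirm that PF1..PF3 are untouched.
static void masked_write_example(void)
{
    uint8_t port = 0x0E;
    port = masked_write(port, 0x11, (0 << 4) | (1 << 0));
    assert(port == 0x0F);
}
```

On the microcontroller the equivalent of masked_write() happens in a single store instruction, which is where both the speed and the atomicity come from.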
The Screen_Init() function has three jobs.
void Screen_Init(void){
//////////////////////////////////////////////////
//Enable GPIOs for connecting to LCD //
//////////////////////////////////////////////////
SYSCTL_RCGC2_R |= (1<<5); // activate port F clock
while((SYSCTL_PRGPIO_R & (1<<5)) == 0){;} // Wait for clock to stabilize
GPIO_PORTF_LOCK_R = 0x4C4F434B; // unlock PortF PF0
GPIO_PORTF_CR_R = 0x1F; // allow changes to PF4-0
GPIO_PORTF_DIR_R |= ((1<<4)|(1<<3)|(1<<2)|(1<<1)|(1<<0)); // make PF4-0 out (1)
GPIO_PORTF_AFSEL_R &= ~((1<<4)|(1<<3)|(1<<2)|(1<<1)|(1<<0)); // disable alt funct
GPIO_PORTF_DR8R_R |= ((1<<4)|(1<<3)|(1<<2)|(1<<1)|(1<<0)); // can drive up to 8mA out
GPIO_PORTF_DEN_R |= ((1<<4)|(1<<3)|(1<<2)|(1<<1)|(1<<0)); // enable digital I/O
GPIO_PORTF_AMSEL_R &= ~((1<<4)|(1<<3)|(1<<2)|(1<<1)|(1<<0)); // no analog
GPIO_PORTF_DATA_R &= ~((1<<4)|(1<<3)|(1<<2)|(1<<1)|(1<<0)); //set to zero
//////////////////////////////////////////////////
//Enable systick interrupts for writing to LCD //
//////////////////////////////////////////////////
uint32_t sysTickReloadCount = (80000000/LCD_REFRESH_HZ/SCREEN_H);
NVIC_ST_CTRL_R = 0; // disable SysTick during setup
NVIC_ST_RELOAD_R = sysTickReloadCount-1; // reload value
NVIC_ST_CURRENT_R = 0; // any write to current clears it
NVIC_SYS_PRI3_R = (NVIC_SYS_PRI3_R&0x00FFFFFF)|0x20000000; // priority 1
NVIC_ST_CTRL_R = 0x07; // enable with core clock and interrupts
writeBuffer = ScreenBufferA;
readBuffer = ScreenBufferB;
}
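The SysTick reload arithmetic in Screen_Init() is worth working through once. The sketch below repeats it on a host machine, assuming the 80 MHz core clock that the constant 80000000 implies; CORE_CLOCK_HZ and the function names are hypothetical and not part of the driver.

```c
#include <assert.h>
#include <stdint.h>

#define CORE_CLOCK_HZ  80000000u  // assumption: 80 MHz core clock
#define LCD_REFRESH_HZ 44u
#define SCREEN_H       64u

// Core clock ticks between two row interrupts (integer division,
// matching the expression in Screen_Init()).
static uint32_t systick_reload_count(void)
{
    return CORE_CLOCK_HZ / LCD_REFRESH_HZ / SCREEN_H;
}

// Row interrupts per second: one per row for every LCD refresh.
static uint32_t row_interrupts_per_second(void)
{
    return LCD_REFRESH_HZ * SCREEN_H;
}

// Sanity check of the two numbers used in the note below.
static void check_systick_numbers(void)
{
    assert(systick_reload_count() == 28409u);
    assert(row_interrupts_per_second() == 2816u);
}
```

With these values the count is 28409 ticks, so 28408 is written to the reload register, and the handler runs 2816 times a second, or roughly once every 355 µs.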
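The following functions make up the public interface which is accessed by the IO.c module. The names should be fairly self-explanatory. The two flags they use, writeBufferIsFull and writeBufferAvailable, are shared with the interrupt handler; their declarations are not shown in this post.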
uint8_t * Screen_GetBuffer(void)
{
return writeBuffer;
}
void Screen_SetBufferIsFull(void)
{
writeBufferIsFull = true;
writeBufferAvailable = false;
}
bool Screen_IsBufferAvailable(void)
{
return writeBufferAvailable;
}
void Screen_ClearBuffer(void)
{
uint16_t i;
for(i=0;i<BUFFER_SIZE;i++){
writeBuffer[i]=0x00; //zeros correspond to the default clear pixel state
}
}
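As an illustration of how a caller might use the pointer returned by Screen_GetBuffer(), here is a sketch of a pixel-setting helper. It is not part of the driver: set_pixel() is a hypothetical name, and the layout is an assumption read off PrintNextRow() (rows stored one after another, eight pixels per byte, bit 7 sent first). Whether the first bit sent ends up as the leftmost pixel depends on the LCD's shift direction, so treat that as an assumption too.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SCREEN_W      480
#define SCREEN_H      64
#define BYTES_PER_ROW (SCREEN_W / 8)

// Hypothetical helper, not part of the post's driver.
// Assumes row-major storage, eight pixels per byte, and that bit 7 of
// each byte is the leftmost pixel of its group of eight.
static void set_pixel(uint8_t *buffer, uint16_t x, uint16_t y, bool on)
{
    assert(x < SCREEN_W && y < SCREEN_H);
    uint16_t byteIndex = (uint16_t)(y * BYTES_PER_ROW + x / 8);
    uint8_t bitMask = (uint8_t)(0x80u >> (x % 8));
    if (on) {
        buffer[byteIndex] |= bitMask;            // turn the pixel on
    } else {
        buffer[byteIndex] &= (uint8_t)~bitMask;  // back to the clear state
    }
}
```

A caller would typically check Screen_IsBufferAvailable(), call Screen_ClearBuffer(), draw into Screen_GetBuffer() with something like the helper above, and finish with Screen_SetBufferIsFull().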
Each time the interrupt service routine is called, the row corresponding to the current value of rowIndex is loaded into the LCD. The value of rowIndex loops from zero up to SCREEN_H-1. A row stays active on the display until the next time the interrupt runs. The second part of the ISR only runs if the last row of a frame was just displayed, if the current frame has been displayed enough times, and if the new frame is ready. If all these conditions are met, then the read and write buffers are switched.
void SysTick_Handler(void){
static const uint8_t minLcdUpdatesPerFrame = LCD_REFRESH_HZ/FRAME_REFRESH_HZ;
static uint8_t rowIndex = 0;
static uint8_t lcdUpdates = 0;
PrintNextRow(rowIndex);
rowIndex = (rowIndex+1)%SCREEN_H;
if(rowIndex==0 && ++lcdUpdates>=minLcdUpdatesPerFrame && writeBufferIsFull)
{
SwitchBuffer();
writeBufferAvailable = true;
lcdUpdates = 0;
writeBufferIsFull = false;
}
}
static void SwitchBuffer(void)
{
if(writeBuffer == ScreenBufferA)
{
writeBuffer = ScreenBufferB;
readBuffer = ScreenBufferA;
}
else
{
writeBuffer = ScreenBufferA;
readBuffer = ScreenBufferB;
}
}
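The swap timing is easier to see with the hardware taken out. The sketch below reuses the handler's decision logic on a host machine: PrintNextRow() is dropped, SwitchBuffer() is replaced by a counter, and swapCount, simulated_tick() and run_ticks() are hypothetical names that exist only for this illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SCREEN_H         64
#define LCD_REFRESH_HZ   44
#define FRAME_REFRESH_HZ 11

// Host-side stand-ins for the driver's flags (names taken from the post).
static bool writeBufferIsFull = false;
static bool writeBufferAvailable = true;
static unsigned swapCount = 0;  // hypothetical: counts buffer switches

// Same decision logic as SysTick_Handler(), minus the hardware access.
static void simulated_tick(void)
{
    static const uint8_t minLcdUpdatesPerFrame = LCD_REFRESH_HZ / FRAME_REFRESH_HZ;
    static uint8_t rowIndex = 0;
    static uint8_t lcdUpdates = 0;

    rowIndex = (uint8_t)((rowIndex + 1) % SCREEN_H);
    assert(rowIndex < SCREEN_H);
    // lcdUpdates is only incremented when a full pass over the rows ends,
    // because && stops evaluating as soon as rowIndex != 0.
    if (rowIndex == 0 && ++lcdUpdates >= minLcdUpdatesPerFrame && writeBufferIsFull) {
        swapCount++;  // stands in for SwitchBuffer()
        writeBufferAvailable = true;
        lcdUpdates = 0;
        writeBufferIsFull = false;
    }
}

// Run the simulated interrupt n times.
static void run_ticks(unsigned n)
{
    while (n--) {
        simulated_tick();
    }
}
```

With a full write buffer, the switch happens on the 256th tick (4 refreshes of 64 rows), which is what limits new frames to 11 per second while the panel itself is refreshed 44 times per second.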
The flowchart below shows the function PrintNextRow() and the functions it calls. The LCD is split into a left and a right half, so the first bit of each half is sent on the same falling edge of the data clock, the second bit of each half on the next falling edge, and so on. The falling edge of the latch clock tells the LCD ICs to take their data buffers and output them to their physical pins as high or low voltages. This signal also tells the IC controlling the rows to shift to the next row. The frame clock shifts a new high value into the row IC, which then gets shifted through each row by the latch clock signal until it shifts off the last pin on the IC. This way the rows can be activated one at a time.
The following code is logically equivalent but has been modified from the simplest implementation to achieve the 3.3 MHz data rate.
These two changes drastically improved the data rate, actually getting it over 5 MHz. The delayns() function is used to tune the data rate back down to roughly 3.3 MHz as measured on a scope.
static void PrintNextRow(uint8_t rowIndex){
uint8_t bytesPerRow,colIndex;
uint8_t bufferByteLeft, bufferByteRight,bitValLeft,bitValRight;
bytesPerRow = SCREEN_W>>3;// divided by 8 (bytes)
for(colIndex=0;colIndex<(bytesPerRow>>1);++colIndex){
bufferByteLeft = readBuffer[rowIndex*bytesPerRow+colIndex];
bufferByteRight = readBuffer[rowIndex*bytesPerRow+colIndex+(bytesPerRow>>1)];
DataClk_PF1 = (1<<1);
bitValLeft = (bufferByteLeft&(1<<7))>>7;
bitValRight = (bufferByteRight&(1<<7))>>7;
SerialData_PF04 = ((bitValRight)<<0)|((bitValLeft)<<4);
DataClk_PF1 = 0x00; //Clocks in data on falling clock edge
delayns();
DataClk_PF1 = (1<<1);
bitValLeft = (bufferByteLeft&(1<<6))>>6;
bitValRight = (bufferByteRight&(1<<6))>>6;
SerialData_PF04 = ((bitValRight)<<0)|((bitValLeft)<<4);
DataClk_PF1 = 0x00; //Clocks in data on falling clock edge
delayns();
DataClk_PF1 = (1<<1);
bitValLeft = (bufferByteLeft&(1<<5))>>5;
bitValRight = (bufferByteRight&(1<<5))>>5;
SerialData_PF04 = ((bitValRight)<<0)|((bitValLeft)<<4);
DataClk_PF1 = 0x00; //Clocks in data on falling clock edge
delayns();
DataClk_PF1 = (1<<1);
bitValLeft = (bufferByteLeft&(1<<4))>>4;
bitValRight = (bufferByteRight&(1<<4))>>4;
SerialData_PF04 = ((bitValRight)<<0)|((bitValLeft)<<4);
DataClk_PF1 = 0x00; //Clocks in data on falling clock edge
delayns();
DataClk_PF1 = (1<<1);
bitValLeft = (bufferByteLeft&(1<<3))>>3;
bitValRight = (bufferByteRight&(1<<3))>>3;
SerialData_PF04 = ((bitValRight)<<0)|((bitValLeft)<<4);
DataClk_PF1 = 0x00; //Clocks in data on falling clock edge
delayns();
DataClk_PF1 = (1<<1);
bitValLeft = (bufferByteLeft&(1<<2))>>2;
bitValRight = (bufferByteRight&(1<<2))>>2;
SerialData_PF04 = ((bitValRight)<<0)|((bitValLeft)<<4);
DataClk_PF1 = 0x00; //Clocks in data on falling clock edge
delayns();
DataClk_PF1 = (1<<1);
bitValLeft = (bufferByteLeft&(1<<1))>>1;
bitValRight = (bufferByteRight&(1<<1))>>1;
SerialData_PF04 = ((bitValRight)<<0)|((bitValLeft)<<4);
DataClk_PF1 = 0x00; //Clocks in data on falling clock edge
delayns();
DataClk_PF1 = (1<<1);
bitValLeft = (bufferByteLeft&(1<<0))>>0;
bitValRight = (bufferByteRight&(1<<0))>>0;
SerialData_PF04 = ((bitValRight)<<0)|((bitValLeft)<<4);
DataClk_PF1 = 0x00; //Clocks in data on falling clock edge
}
if(rowIndex==0)
FrameClock();
else
LatchClock();
}
//Latch Clock: LatchClk_PF2
static void LatchClock(void){
unsigned long i=17; //specific number doesn't matter
LatchClk_PF2 = (1<<2);
i=i/i; //Delay enough to keep peak at least ~80ns wide
LatchClk_PF2 = 0x00;
}
//DIO1: FrameClk_PF3
static void FrameClock(void){
FrameClk_PF3 = (1<<3);
LatchClock();
FrameClk_PF3 = 0x00;
}
static void delayns(void){
unsigned int i=17; //specific number doesn't matter, but it must be initialized before the divides
i=i/i/i;
}
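For comparison, here is a guess at what the rolled-up loop that the unrolled code is logically equivalent to might look like. The original simple version is not shown in this post, so this is a sketch under assumptions: the pin write is replaced by a hypothetical send_bit_pair() that records values in an array so the sequence can be checked on a host machine, and the clock toggling and delays are left out.

```c
#include <assert.h>
#include <stdint.h>

#define SCREEN_W      480
#define BYTES_PER_ROW (SCREEN_W / 8)
#define HALF_ROW      (BYTES_PER_ROW / 2)

// Hypothetical stand-in for the SerialData_PF04 write: records each
// value that would be put on PF4/PF0 instead of touching hardware.
static uint8_t sentValues[SCREEN_W / 2];
static uint16_t sentCount = 0;

static void send_bit_pair(uint8_t bitValLeft, uint8_t bitValRight)
{
    assert(sentCount < SCREEN_W / 2);
    sentValues[sentCount++] = (uint8_t)((bitValRight << 0) | (bitValLeft << 4));
}

// Rolled-up form of the per-row loop: for each pair of bytes (one from
// the left half, one from the right), walk the bits from 7 down to 0.
static void print_row_rolled(const uint8_t *buffer, uint8_t rowIndex)
{
    sentCount = 0;
    for (uint8_t colIndex = 0; colIndex < HALF_ROW; ++colIndex) {
        uint8_t left  = buffer[rowIndex * BYTES_PER_ROW + colIndex];
        uint8_t right = buffer[rowIndex * BYTES_PER_ROW + colIndex + HALF_ROW];
        for (int bit = 7; bit >= 0; --bit) {
            send_bit_pair((uint8_t)((left >> bit) & 1u),
                          (uint8_t)((right >> bit) & 1u));
        }
    }
}
```

Each row takes 240 bit pairs, one per data clock. The unrolled version avoids the inner loop counter and the variable shift, which is presumably part of why it runs faster.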
The code I have now is working quite well, but I think I could utilize a dual SPI setup to do the actual sending of each row. I believe there are some microcontrollers which can send multiple streams of data on a single SPI peripheral, but I don’t believe the one I’m using can. But I might be able to use two SPI peripherals and have them share a single clock.
If I get SPI working I’d also like to utilize DMA for transferring data to the SPI peripheral.