4th August, 2008

Debugging ACPI Using WinDBG /--evilbitz   

Hi,

Here are some tips about debugging Windows ACPI DSDT/ASL using windbg.

Installing the checked version of acpi.sys

You need the checked version of acpi.sys. Download the checked build of your service pack, unpack it locally, and expand the acpi.sy_ file (it is actually a .cab file). The checked version lets you use the AMLI debugger to trace and step through ASL code.

Tracing ACPI ASL Code and Object evaluation

!amli set traceon spewon verboseon - This is a bit slow but produces a nice log file (for real men only).

ASL Debug Print

If you can change the code (dump and disassemble the DSDT, then compile and embed it again), you can add string outputs to the ASL code. There are two ways to do that. If you have a debugger connected, use the simple method of storing a string into the Debug object (first example below). The other way is to use my ASL print function, which prints to an I/O port of your choice; this is only useful if you are a platform developer or run inside a virtual machine.

Examples:

Store ("Debug asl print example - 1", Debug)
\ZDBG ("Debug asl print example - 1")

Break Points

  • If you want to debug ASL code, you can set breakpoints with !amli bp
  • You can embed a breakpoint by changing the DSDT and putting the BreakPoint directive in the ASL code where you want the debugger to break.

Once you have broken into the AMLI debugger, you can trace and step through the code.




21st July, 2008

An implementation of ACPI ASL print function for Xen /--evilbitz   

Hi All,

This post is intended more for Xen developers, but you might be able to understand a thing or two.

ASL code (compiled to AML bytecode) lives in the DSDT and SSDT tables of ACPI. It is provided by the system manufacturer and is intended to provide an abstract interface for configuring and accessing the hardware, especially integrated parts such as the embedded controller. The operating system has an interpreter which executes this code in its own context; this is how you write hardware-specific code that can run on any ACPI-compliant OS.

Part of my work now involves merging ACPI ASL bits in order to support PCI/PCIe pass-through capabilities. To debug ASL code, I've made a debug function which takes a string as an argument and writes each byte of the string to I/O port 0xE9. Xen uses this port for HVM debugging, so if you have your UART connected, all the strings will appear there.

Here is the code:

|    /* Debug ACPI using io-port 0xE9 */
|    OperationRegion (DBGP, SystemIO, 0xE9, 0x01)
|    Field (DBGP, ByteAcc, NoLock, Preserve)
|    {
|       /* HVM debug char */
|       HDBG, 8
|    }
|
|    Method (ZDBG, 1, NotSerialized)
|    {
|       /* Local0 - length of the debug string */
|       Store (SizeOf (Arg0), Local0)
|       Increment (Local0)
|       /* Init STR buffer from Arg0 */
|       Name (STR, Buffer (Local0) {})
|       Store (Arg0, STR)
|
|       /* Prepend prefix "ACPI-DBG: " */
|       Name (PRFX, Buffer () {
|          0x41, 0x43, 0x50, 0x49, 0x2d,
|          0x44, 0x42, 0x47, 0x3a, 0x20
|       })
|
|       /* INPR -> Concatenated string with prefix */
|       Add (Local0, 0x0a, Local0)
|       Name (INPR, Buffer (Local0) {})
|       Concatenate (PRFX, STR, INPR)
|
|       /* Output string to ioport HDBG */
|       Store (Zero, Local1)
|       Decrement (Local0)
|       While (LLess (Local1, Local0))
|       {
|          Store (DerefOf (Index (INPR, Local1)), HDBG)
|          Increment (Local1)
|       }
|       /* End with a newline */
|       Store (0x0a, HDBG)
|    }
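The index arithmetic in ZDBG is easy to get wrong by one, so here is a small Python model of the byte stream the method should produce. It is only a sketch: it assumes that STR ends up holding the string plus a trailing NUL byte, and the function name zdbg_bytes is made up for this illustration.

```python
PREFIX = b"ACPI-DBG: "  # the ten bytes stored in PRFX


def zdbg_bytes(message):
    """Model the bytes ZDBG writes to port 0xE9 (assumes STR is NUL-terminated)."""
    # STR: SizeOf(Arg0) + 1 bytes, i.e. the string plus a trailing NUL
    str_buf = message.encode("ascii") + b"\x00"
    # INPR: the prefix followed by STR
    inpr = PREFIX + str_buf
    # Local0 after Increment, Add 0x0a and Decrement
    count = len(message) + 1 + 0x0A - 1
    out = bytearray()
    for i in range(count):      # the While loop stops before the trailing NUL
        out.append(inpr[i])
    out.append(0x0A)            # "End with a newline"
    return bytes(out)


print(zdbg_bytes("hello"))
```

Under that assumption the loop emits the prefix and the message but not the NUL, and the newline comes last.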

Thanks!




23rd December, 2006

A Simple Python CPU Emulator /--evilbitz   

In computer science, an emulator is software that emulates an environment for other software to run in. It may include some hardware devices as well, and if a complete environment is emulated it is usually called a virtual machine. An example of such an environment is DOSEMU, an emulated DOS environment designed to run on top of the Linux operating system. The core of any software emulator is the CPU emulation part. Many virtual machines use the CPU emulation that the Bochs project created, which can run x86 and AMD64 software.

I played a bit with emulators in the past, and I created a simple Python CPU emulator that emulates "flat" assembly code whose instruction set is defined by a configuration file. In this post I'll describe its features, how it works, and what can be achieved by using it.

Simple Python CPU Emulator

The CPU emulator itself has a couple of runtime variables that are used by the parsers. It moves line by line over the assembly that you tell it to emulate, and if the current assembly line matches a pattern (defined in the configuration file), it executes the statement associated with that pattern against the run-time state of the CPU registers.

The configuration file of the emulator has two parts: it defines the registers that the CPU holds and the instructions that the emulated CPU can execute. The design is very simple and elegant. When the emulator initializes, it prepares a Python dictionary of opcodes. For instance, the opcode INC is defined as follows in the configuration file:

INC $reg1 | $reg1 += 1

When the parser encounters that configuration line, it adds the following key => value to the dictionary:

opcodes["INC (.*?)"] = "cpu[execed.group(1)] += 1"

Where cpu[execed.group(1)] is the run-time CPU register that is matched by the CPU emulator parser. When the parser reads a new assembly line, it first moves through all the keys of the opcodes dictionary and tries a regular expression match against each. If it finds one, the corresponding value in the dictionary is simply executed. Here is the source code of the emulation loop:

while cur_op_line < end_of_ops:
    opcode = a_ops[cur_op_line]
    # Find the first configured pattern that matches this assembly line
    execed, ptrn = get_exec(opcode)
    if execed:
        if a_verbosity:
            print("%d:\t%s" % (cur_op_line + 1, opcode))
        # Run the Python statement associated with the matched pattern
        exec(opcodes[ptrn])
    else:
        print("[-] haven't found a match for", opcode)

    cur_op_line += 1
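The loop relies on two pieces that are not shown: the translation of a configuration line into a dictionary entry, and get_exec. Here is a minimal sketch of how they might look. Both functions are my reconstruction for illustration, not the emulator's actual source; compile_rule is a made-up name, and this get_exec takes the dictionary as an extra argument. Note the use of re.fullmatch: with a plain re.match, the lazy (.*?) group at the end of the pattern would match an empty string.

```python
import re


def compile_rule(line):
    """Turn 'INC $reg1 | $reg1 += 1' into a (regex, statement) pair."""
    pattern, action = [part.strip() for part in line.split("|", 1)]
    # Placeholders are numbered in order of appearance: first -> group 1, ...
    names = re.findall(r"\$\w+", pattern)
    for group, name in enumerate(names, start=1):
        pattern = pattern.replace(name, "(.*?)", 1)
        action = action.replace(name, "cpu[execed.group(%d)]" % group)
    return pattern, action


def get_exec(opcode, opcodes):
    """Return (match, pattern) for the first rule matching the whole line."""
    for ptrn in opcodes:
        execed = re.fullmatch(ptrn, opcode.strip())
        if execed:
            return execed, ptrn
    return None, None


cpu = {"r0": 0}
opcodes = dict([compile_rule("INC $reg1 | $reg1 += 1")])
execed, ptrn = get_exec("INC r0", opcodes)
# Pass the names the statement needs explicitly instead of relying on globals
exec(opcodes[ptrn], {"cpu": cpu, "execed": execed})
print(cpu)
```

Passing an explicit namespace to exec keeps the sketch working inside a function too; the original loop runs at module level, where the globals are visible anyway.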

A Riddle For Fun

My friend Imri gave me this riddle when he took a course in complexity at Tel Aviv University.

You have a CPU with 4 registers (r0 through r3) and 3 basic opcodes that work as described:

  • INC reg – increases the value of reg by 1
  • DEC reg – decreases the value of reg by 1
  • TEST reg, line – if reg != 0 then start executing opcodes at the specified line

Download the source code, which comes ready with the configuration file for this exercise, and write a program that ends its execution when one of the registers contains the value 144.

  • The CPU starts with all the registers initialized to zero.
  • The program length must not exceed 20 instructions.
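If you want to try candidate programs without the configuration-driven emulator, the three opcodes are small enough to interpret directly. This is a standalone sketch, not the emulator from this post. It assumes that TEST line numbers are 1-based, that registers may go negative, and that execution ends when it runs past the last instruction; the step limit is my addition to catch endless loops.

```python
def run(program, max_steps=100000):
    """Run a list of 'INC r', 'DEC r' and 'TEST r, line' strings."""
    regs = {"r0": 0, "r1": 0, "r2": 0, "r3": 0}
    pc = 0      # zero-based index of the next instruction
    steps = 0
    while pc < len(program):
        steps += 1
        if steps > max_steps:
            raise RuntimeError("step limit reached, probably an endless loop")
        op, _, rest = program[pc].partition(" ")
        args = [a.strip() for a in rest.split(",")]
        if op == "INC":
            regs[args[0]] += 1
        elif op == "DEC":
            regs[args[0]] -= 1
        elif op == "TEST":
            if regs[args[0]] != 0:
                pc = int(args[1]) - 1   # jump target is a 1-based line number
                continue
        else:
            raise ValueError("unknown opcode: " + program[pc])
        pc += 1
    return regs


# Not a solution to the riddle: just a loop that moves 3 from r1 into r0
demo = ["INC r1", "INC r1", "INC r1", "INC r0", "DEC r1", "TEST r1, 4"]
print(run(demo))
```

A riddle solution is then any program of at most 20 lines for which run() returns a register holding 144.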

Have pun! :-)




8th December, 2006

Interrupts and Interrupt-Controllers /--evilbitz   

Abstract

This article is a kind of continuation of my under-the-hood article series; you can take a look at my previous articles regarding the PCI bus. In this article I delve deeper into interrupts, and we'll explore what exactly happens when an interrupt occurs.

If you are a software developer, this information will help you grasp the PC architecture a little bit better. This article's intent is to encourage people to explore things and widen their knowledge... I see too many people who are trapped inside their operating system's environment, especially if they are using the M$ black box. This is the only reason I see for switching to Linux :-) Ok, so let's start!

Interrupts

An interrupt is basically a signal from the hardware that tells the software to perform an operation. It is handled by the operating system that calls the ISR (Interrupt Service Routine) for that interrupt request (IRQ).

Generally, we distinguish between two types of interrupts: edge-triggered and level-triggered. Edge-triggered interrupts are caused by a change in the level of the bus line, that is, a transition from 1 to 0 or from 0 to 1 (falling edge and rising edge respectively). This "old-fashioned" type of interrupt was used on the ISA bus. The problem with it is that it is difficult to share: several devices couldn't share the same IRQ line. Level-triggered interrupts are caused by raising or lowering the level of the bus line and holding it there until the interrupt is serviced. They were used in the original PCI bus (and still are) as the standard type of interrupt.
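To see why edge-triggered lines are hard to share, here is a small Python sketch that treats an interrupt line as a list of sampled levels. It is a toy model under my own assumptions: the line is active-high, only rising edges count, and two devices sharing a line are modeled by OR-ing their signals.

```python
def edge_triggered(samples):
    """Indices where the line goes from 0 to 1 (a rising edge)."""
    return [i for i in range(1, len(samples))
            if samples[i - 1] == 0 and samples[i] == 1]


def level_triggered(samples):
    """Indices where the line is held at the active level (1)."""
    return [i for i, level in enumerate(samples) if level == 1]


# Device B asserts the shared line while device A is still holding it
dev_a = [0, 0, 1, 1, 1, 0, 0, 0]
dev_b = [0, 0, 0, 1, 1, 1, 0, 0]
shared = [a | b for a, b in zip(dev_a, dev_b)]

print(edge_triggered(shared))    # a single edge for two requests
print(level_triggered(shared))   # the line stays active until both are done
```

With two overlapping requests the shared line shows only one edge, so the second request is invisible to an edge detector, while a level detector keeps seeing the line active until both devices release it.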

Interrupt Controllers

The interrupt controller's goal is to provide interrupt capabilities to the main processor (CPU) through a single line. When a device issues an interrupt, it is delivered to one of the interrupt controller's IRQ pins. From there the interrupt is raised in the CPU, which in turn asks the interrupt controller for the source of the interrupt through a special register that is held and managed by the interrupt controller.

Old interrupt controllers, such as the PIC (Programmable Interrupt Controller), provided interrupt priority, interrupt masking and general flexibility for dealing with interrupts in the platform. Old devices were programmed to use fixed IRQs, and problems arose when two or more devices shared the same IRQ: if the PIC was programmed for edge-triggered mode, serious conflicts could cause the system to hang or not function at all. Well... in edge-triggered mode interrupts actually could be shared if the devices were specially built for it. But since the operating system must run all the ISRs in the chain associated with the signaled IRQ, it is not so efficient after all.

In level-triggered mode the PIC knew how to share IRQs between different devices, but sharing interrupts is not a good deal in any case: it leads to performance issues and to faults caused by poorly written device drivers.

Consider the following scenario: Device A, which shares its IRQ with Device B, signals its driver; the interrupt is issued and the operating system processes it. The chain of ISRs contains two different ISRs. The driver for Device B processes the interrupt first because it is first in the chain, and decides that it is going to handle the interrupt (because its poorly written ISR handles any interrupt). The operating system sees that the interrupt was handled, stops executing the ISR chain and acknowledges the PIC. After some time Device A sees that the interrupt wasn't handled by its driver and issues the same interrupt again. This scenario leads to interrupt storms or causes Device A to stop functioning.
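The scenario can be replayed in a few lines of Python. This is a toy model, not real driver code: the Device class, the two ISR factories and the limit of five re-raises are all made up for the illustration. The dispatch function follows the policy described above, stopping at the first ISR that claims the interrupt.

```python
class Device:
    def __init__(self, name):
        self.name = name
        self.pending = False    # True while the device waits for service


def good_isr(device):
    """A well-behaved ISR: claims the interrupt only if its device raised it."""
    def isr():
        if device.pending:
            device.pending = False
            return True
        return False
    return isr


def greedy_isr(device):
    """A poorly written ISR: claims every interrupt without checking."""
    def isr():
        device.pending = False
        return True
    return isr


def dispatch(chain):
    """Run the ISRs in order and stop at the first one that claims the IRQ."""
    for isr in chain:
        if isr():
            return True
    return False


def count_raises(chain, device, limit=5):
    """Raise the interrupt until the device is serviced or the limit is hit."""
    device.pending = True
    raised = 0
    while device.pending and raised < limit:
        raised += 1
        dispatch(chain)
    return raised


dev_a, dev_b = Device("A"), Device("B")
print(count_raises([greedy_isr(dev_b), good_isr(dev_a)], dev_a))  # hits the limit
print(count_raises([good_isr(dev_b), good_isr(dev_a)], dev_a))    # one interrupt
```

With the greedy ISR first in the chain, Device A is never serviced and keeps re-raising until the artificial limit stops it; with two well-behaved ISRs a single interrupt is enough.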

To solve this problem and allow more flexibility, Advanced Programmable Interrupt Controllers (APICs) were introduced. They provide more IRQs and their function is better adapted to the operating system. Windows, for example, uses its IRQL mechanism to mask interrupts in the APIC; this is accomplished by a single mov assembly instruction, instead of running several I/O port instructions (such as in or out) as is the case with a PIC.

APICs are used in multiprocessor environments. Each CPU has its own Local APIC (LAPIC), and another controller, called the I/O APIC, routes interrupts from the bus to the LAPICs. This allows greater flexibility, and sharing interrupts is much less of an issue, since an I/O APIC typically has 24 interrupt inputs. Another advantage is that each LAPIC has its own timer, which lets each CPU better schedule the CPU time distributed between threads in quantum units. APICs also support IPIs (Inter-Processor Interrupts), a way for one processor to interrupt another. IPIs are used for synchronization and cache coherency. The I/O APIC's function is to distribute interrupts between the CPUs in a multiprocessor environment.
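The routing idea can be sketched as a small table-driven model in Python. This is a heavy simplification under my own assumptions: a routing entry here holds only a vector and a destination CPU, the 24-pin default is just a common I/O APIC size, and all class and function names are invented for the example.

```python
class LocalApic:
    def __init__(self, cpu_id):
        self.cpu_id = cpu_id
        self.delivered = []     # vectors this CPU has received

    def accept(self, vector):
        self.delivered.append(vector)


class IoApic:
    def __init__(self, lapics, pins=24):
        self.lapics = {lapic.cpu_id: lapic for lapic in lapics}
        self.table = [None] * pins      # pin -> (vector, destination CPU id)

    def program(self, pin, vector, dest_cpu):
        self.table[pin] = (vector, dest_cpu)

    def assert_pin(self, pin):
        entry = self.table[pin]
        if entry is None:
            return False                # unprogrammed pins are treated as masked
        vector, dest_cpu = entry
        self.lapics[dest_cpu].accept(vector)
        return True


def send_ipi(target, vector):
    """One processor interrupting another goes straight to the target LAPIC."""
    target.accept(vector)


cpu0, cpu1 = LocalApic(0), LocalApic(1)
ioapic = IoApic([cpu0, cpu1])
ioapic.program(1, 0x31, dest_cpu=0)     # hypothetical pin and vector numbers
ioapic.program(12, 0x3C, dest_cpu=1)
ioapic.assert_pin(1)
ioapic.assert_pin(12)
send_ipi(cpu1, 0xE1)
print(cpu0.delivered, cpu1.delivered)
```

Device interrupts pass through the routing table, while an IPI bypasses the I/O APIC entirely.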

Final Words

If you bore with me to this point, it is surely admirable! Even I couldn't bear with myself writing this post :-)

Anyway, we saw how interrupts are generated and how they are routed on platforms based on the PIC and APIC implementations, and we also took a look at how the operating system handles interrupts. I hope you enjoyed this post.

Evilbitz.




2nd November, 2006

Brief Introduction to PCI – Part II /--evilbitz   

This is the second part of my introduction to PCI. If you haven't read the first part, it is highly recommended that you read it first.

So, in part I, we learned that all our PCI devices are connected through this main bus line to the CPU and memory. Any entity in the system can use the PCI bus, but it must first ask the bus arbiter for permission. A bus master is any entity that holds the bus and is able to write to it.

Any device can hold memory resources that are used to transfer data between the driver and the device. Those addresses may be used for storing the current display buffer, for example. This memory resource is mapped into the same physical address space as the computer's main RAM, so if the driver wants to copy data to the buffer, it just copies the data to the address where the BIOS mapped the resource.

BIOS stands for Basic Input/Output System. It is responsible for booting the system and providing it with some basic capabilities, like primitive display adapter support. One part of the BIOS walks through the existing PCI devices and supplies them with the memory resources that they require. After doing so, it boots the operating system, which enumerates the PCI devices and the existing memory resources.

Each PCI device is identified by a device ID and a vendor ID; both are part of the standard 64-byte configuration header that each PCI device holds. After the operating system probes for this info, it searches for an appropriate driver to handle the device by matching the device/vendor IDs.
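As a concrete illustration, the vendor ID is the little-endian 16-bit value at offset 0 of the configuration header and the device ID is the one at offset 2. The Python sketch below parses them from raw header bytes and looks up a driver. The header contents, the device ID 0x1234, the driver table and the driver name are made-up examples; only the field layout and the Intel vendor ID 0x8086 are real.

```python
import struct


def parse_ids(config_space):
    """Return (vendor_id, device_id) from the start of a PCI config header."""
    if len(config_space) < 4:
        raise ValueError("need at least the first 4 bytes of the header")
    vendor_id, device_id = struct.unpack_from("<HH", config_space, 0)
    return vendor_id, device_id


# Hypothetical driver table keyed by (vendor ID, device ID)
DRIVERS = {(0x8086, 0x1234): "example_nic_driver"}


def find_driver(config_space, drivers=DRIVERS):
    """Return the matching driver name, or None if no driver claims the IDs."""
    return drivers.get(parse_ids(config_space))


# A fake 64-byte header: vendor 0x8086, device 0x1234, the rest zeroed
header = bytes([0x86, 0x80, 0x34, 0x12]) + bytes(60)
print([hex(x) for x in parse_ids(header)], find_driver(header))
```

On Linux the same bytes can be read from a device's config file under /sys/bus/pci/devices, which makes it easy to try the parser on real hardware.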

In Windows there is a kernel component called the Plug and Play manager that loads the right driver and lets it initialize itself. There is also a file for each driver that describes these IDs. These files are called INF files, and they provide the PnP manager with the info for loading and installing the appropriate driver. To look at some examples, you can dig in the %windir%\inf directory.

Let's go back to the memory resources. As I said, these are memory regions in the computer's address space that are mapped to the PCI device. When a driver needs to write data, it may use the main CPU for this task or it may use a special hardware unit called a DMAC.

DMAC stands for Direct Memory Access Controller, and it's basically a processor that knows how to copy data from one place to another. Drivers use this controller to carry out copy operations, leaving the main CPU free to carry out other tasks.

DMA may operate in two ways. In the standard one, the DMAC reads the next byte from the source each time (the PCI device, for example) and copies it to its destination (the RAM).

More sophisticated DMACs use a method called flow-through, in which the controller just plays with signals over the bus and holds no actual data itself. It operates in this manner: at the same time it issues a read operation to the device and a write operation to memory, letting the data flow smoothly over the bus. The DMAC keeps doing this until it finishes its transaction.
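The difference between the two modes comes down to how many bus cycles each byte costs. The Python sketch below counts them in an idealized model. The function names and the one-cycle-per-operation accounting are my assumptions; real controllers transfer wider words and have arbitration overhead that is ignored here.

```python
def copy_via_dmac(device_data, memory, dest):
    """Standard mode: each byte is read into the DMAC, then written out."""
    cycles = 0
    for offset in range(len(device_data)):
        latch = device_data[offset]     # bus cycle 1: device -> DMAC
        cycles += 1
        memory[dest + offset] = latch   # bus cycle 2: DMAC -> memory
        cycles += 1
    return cycles


def copy_single_cycle(device_data, memory, dest):
    """Second mode: device read and memory write share one bus cycle."""
    cycles = 0
    for offset, byte in enumerate(device_data):
        memory[dest + offset] = byte    # the DMAC never holds the data itself
        cycles += 1
    return cycles


data = b"frame"
ram_a, ram_b = bytearray(16), bytearray(16)
print(copy_via_dmac(data, ram_a, 4), copy_single_cycle(data, ram_b, 4))
```

Both functions leave the same bytes in memory; in this model the second one simply needs half the bus cycles.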

I hope that this series of articles was informative and enjoyable for you. Keep checking back for new posts on this cool blog! :-)



