Amiga-Development


Author Topic: Disassembling PPC code on OS4  (Read 2587 times)


adminZRt75

Disassembling PPC code on OS4
« on: December 25, 2012, 07:49:40 PM »

Started by Hypex on utilitybase

Hypex
Member

   Posted: 2006-Aug-6 16:45:49

Hi.

I'd like to know the easiest way to disassemble PPC code on OS4. I'm just interested in what some code looks like, either from a file or in memory.

I have seen Scout disassemble, but raw PPC is hard enough to understand. I thought using the GNU debugger might help here. Would that be the best move, or is there another tool I can use?



itix
Member

   Posted: 2006-Aug-6 18:16:44

objdump --disassemble-all --reloc MyExe >MyExe.s



rachy
Flower Power! @-->--

   Posted: 2006-Aug-6 19:23:34

@Hypex: if you need a runtime PPC disassembly dump, you can make use of the debug interface of ExecSG. After obtaining it you can simply call the DisassembleNative() function with these parameters:

char opcode[200];
char operands[200];

IDebug->DisassembleNative(targetaddress,opcode,operands);



It will copy the opcode and the operands into the respective strings.



Hypex
Member

   Posted: 2006-Aug-7 16:51:09

@All.

Thanks guys. I think I may have come across those disassembling functions, I just wondered if there was an easy way of looking at code. Especially because AFAIK there is nothing yet like MonAm for OS4, even as simple as it is. Where are those 68k coders that write heaps of debuggers? We need them now!



rachy
Flower Power! @-->--

   Posted: 2006-Aug-7 20:50:02

Hypex: you could give GDB a shot. It is a bit complicated to use, but you get used to it. BTW, there is a GUI for it; I don't know whether it has been released to the public yet.



Hypex
Member

   Posted: 2006-Aug-8 16:46:14

Hello again.

I tried using the information to write a simple debugger, but the result just crashes, because obviously I've messed up the pointers. This is what I hate in C: I never know what is what. With 68k ASM and E I know how to work it, but this just eludes me.

Check this out, and you will instantly see the error of my ways, I bet. It's called Dasm.

For example, if I give this as an argument, which just happened to be in executable space by chance on my machine, I get this result (without the comments):
Dasm 0x1234560

Addr: $1234560       Good.
$719C0304:         Huh?!   

#include <stdio.h>
#include <proto/exec.h>
#include <exec/execbase.h>
#include <exec/memory.h>
#include <proto/dos.h>

int main(int argc, char *argv[])
{
   struct DebugIFace *IDebug;
   APTR addr;
   ULONG i;
   LONG chars;
   char opcode[200];
   char operands[200];

   IDebug=(struct DebugIFace *) IExec->GetInterface(SysBase, "main", 1, 0);
   if (IDebug)
   {
       if ((chars=IDOS->HexToLong(argv[1], (ULONG *)&addr)!=-1) && IExec->TypeOfMem((ULONG *)addr)==MEMF_EXECUTABLE)
       {
           printf("Addr: $%lx\n",addr);
           opcode[0]=0;
           operands[0]=0;
           for(i=0;i<8;i++)
           {
               addr=IDebug->DisassembleNative(addr,opcode,operands);
               printf("$%08X: %-8s %s\n",addr,opcode,operands);
           }
       }
       printf("S:%s C:%d, H:%lx, P:%lx\n", argv[1], chars, addr, &addr);
       IExec->DropInterface((struct Interface *)IDebug);
   }
}



abalaban
Member

   Posted: 2006-Aug-8 17:55:26

Hello Hypex,

try replacing

printf("$%08X: %-8s %s\n",addr,opcode,operands);

with

opcode[199]=operands[199]='\0'; // ensure no overflow
printf("$%08X: '%-8s' '%s'\n",addr,opcode,operands);



and give us the result here.



tboeckel
Member

   Posted: 2006-Aug-8 20:08:14

@hypex
      IDebug=(struct DebugIFace *) IExec->GetInterface(SysBase, "main", 1, 0);



Try to GetInterface() the "debug" interface instead of the "main" interface. I think it will work then.



abalaban
Member

   Posted: 2006-Aug-9 08:44:17

@tboeckel

Well spotted, I would have sworn I looked at that first too ;-)



rachy
Flower Power! @-->--

   Posted: 2006-Aug-9 09:21:04 · Edited by: rachy

@abalaban
      opcode[199]=operands[199]='\0'; // ensure no overflow



This does nothing against overflow, I hope you are aware of this...
Yep, I know that these functions are not safe to use (overflow-wise), but 200 bytes "must be enough for everything". ;)



abalaban
Member

   Posted: 2006-Aug-9 09:32:58 · Edited by: abalaban

Hello Rachy,
      This does nothing against overflow, I hope you are aware of this...



'Overflow' may not be the right English term. What I meant was that with this I ensure there is at least a '\0' in both strings, thus preventing printf from going beyond the 200 char limit. For me this is also a kind of overflow.



Hypex
Member

   Posted: 2006-Aug-9 16:47:49

@abalaban

I get this:

7.Work:Development/SDK/Examples/Basic> cc Dasm.c
Dasm.c: In function 'main':
Dasm.c:23: warning: assignment makes integer from pointer without a cast

Line 23 has the assignment.

Then, when running it, I get this, with a crash:

Dasm 0x1235600
Addr: $1235600
$01235600: Ä    G|0
$7191030C: Ä    G|0
$7191030C: Ä    G|0
$7191030C: Ä    G|0
$7191030C: Ä    G|0
$7191030C: Ä    G|0
$7191030C: Ä    G|0
$7191030C: Ä    G|0
S:0x1235600 C:1, H:7191030c, P:2477b34

But, although I cannot prove it now since the answer is here, I worked it out earlier tonight while examining my code. Will explain below.



Hypex
Member

   Posted: 2006-Aug-9 17:02:19

@tboeckel

Yes, that is it, good work. After wondering what was happening I looked at the source and commented out the printf() and Dis#?() call to see what crashed. In the Grimy window, the task was trying to write to the target address, which didn't seem right. Although it is called target, not source, in the AutoDoc, so that could have meant anything. It also looked like it was trying to set a node pointer or similar. Then I went back and wondered if I had set up the interface to an actual pointer. I had, and noticed it was "main", aha!

I changed it and it all started working. I'm thinking this would be a good example of easy mistakes that can be made in code producing a bad result.

If you look into it further like I did, it is just retrieving the main Exec interface pointer and, going by the function slot, calling AddMemHandler(). It thinks my target address is an interrupt!

I do wonder about these disassembling functions, because they assume things about the size of the strings, and don't leave an option for parameter passing. It would be good to specify string length, in case of "overflow", and have some options such as specifying hex numbers only in the operands or using the traditional assembly "$" instead of the C "0x". Otherwise, a very good and useful function.
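
As a rough illustration of the "$" wish above, the operand string could be post-processed after the call. This is only a sketch in portable C: hex_prefix_to_dollar() is a hypothetical helper, not part of the OS4 debug interface, and it assumes the operands carry plain "0x" prefixes as described in this thread.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper: copy `in` to `out` (at most out_size bytes,
 * including the terminator), rewriting each C-style "0x" hex prefix
 * as "$". Returns 0 on success, -1 if the output was truncated. */
int hex_prefix_to_dollar(const char *in, char *out, size_t out_size)
{
    size_t o = 0;

    assert(in != NULL && out != NULL);
    if (out_size == 0)
        return -1;

    for (size_t i = 0; in[i] != '\0'; i++) {
        /* Only treat "0x" as a prefix at the start of a token. */
        int token_start = (i == 0) || !isalnum((unsigned char)in[i - 1]);

        if (o + 1 >= out_size) {        /* keep room for the terminator */
            out[o] = '\0';
            return -1;
        }
        if (token_start && in[i] == '0' && in[i + 1] == 'x' &&
            isxdigit((unsigned char)in[i + 2])) {
            out[o++] = '$';
            i++;                        /* skip the 'x' as well */
        } else {
            out[o++] = in[i];
        }
    }
    out[o] = '\0';
    return 0;
}
```

Called on an operand string such as "r3,0x10(r1)", this yields "r3,$10(r1)". The explicit out_size parameter is the kind of length argument asked for above.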



rachy
Flower Power! @-->--

   Posted: 2006-Aug-10 08:29:06 · Edited by: rachy

@abalaban
      what I meant was that with this I ensure there is at least a '\0' in both strings, thus preventing printf from going beyond the 200 char limit



Ah, got it. Yes, that is overflow, but not particularly useful, only if the function has no proper string handling (forgets to put EOS at the end of the string), but that is a bug. If it overwrites the 200 chars then it will surely overwrite the EOS too... :/

@Hypex
      It would be good to specify string length, in case of "overflow", and have some options such as specifying hex numbers only in the operands or using the traditional assembly "$" instead of the C "0x".



You are absolutely right, but on the other hand: it is hardly possible to overrun 200 chars in these functions. There is no opcode in PPC assembly longer than 8 chars (it would be lovely to have a 100 chars long opcode ;) nor will the operands ever be longer than 20 chars at most. So don't be afraid of running over the bounds of your string.

About the "0x" prefix: all the PPC assemblers (except StormPowerASM, which strayed really far from the PPC standards) use this prefix instead of the "$" sign. This is also what IBM/Motorola suggest in their documentation. It is just a matter of getting used to it.



tboeckel
Member

   Posted: 2006-Aug-10 09:27:23

What bothers me is exactly this:
      it is hardly possible to overrun 200 chars in these functions.

Even if it is hardly possible, it is not impossible.

If you tell me that DisassembleNative/68k() only puts one single byte into the user-supplied buffer I would not believe you. Sorry. This behaviour may be true for the current version, but what about the future? I treat every non-limiting string function as unsafe by default. Such behaviour is the cause of many buffer overflows which are really hard to spot sometimes.

All my projects have switched to snprintf(), strlcpy() and strlcat() instead of sprintf(), strcpy() and strcat(). I am quite sure that all my buffers are largely oversized, but on the other hand even with random data of unpredictable size I never had a crash because of a buffer overflow.
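
The bounded-formatting habit described above can be sketched in portable C. format_disasm_line() is a hypothetical helper, not an OS4 call; it bounds only the formatted output line and cannot limit what a disassembler writes into opcode and operands in the first place.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: format one listing line into `out` without ever
 * writing more than out_size bytes. Returns 1 if the whole line fit,
 * 0 if it had to be truncated (the result is still NUL-terminated). */
int format_disasm_line(char *out, size_t out_size, unsigned long addr,
                       const char *opcode, const char *operands)
{
    int needed;

    assert(out != NULL && out_size > 0);

    /* snprintf() returns the length the full line would have had. */
    needed = snprintf(out, out_size, "$%08lX: %-8s %s",
                      addr, opcode, operands);
    return needed >= 0 && (size_t)needed < out_size;
}
```

With sprintf() the same call would write past a buffer that is too small; here the caller gets a truncated line and a return value saying so.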



abalaban
Member

   Posted: 2006-Aug-10 09:41:14
      If it overwrites the 200 chars then it will surely overwrite the EOS too... :/



Yes, but that's why I add them *after* the call to DisassembleNative(). This way, even if it writes 8000 bytes (and in the event that it did not crash), I assure the printf statement that there will be at most 199 chars to write.
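
The point of terminating after the call can be sketched like this. clamp_terminate() is a hypothetical name; the sketch only shows that a later %s read stays inside the buffer, and it does nothing about a writer that has already run past the end.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: force a terminator into the last byte of `buf`
 * after some untrusted routine has filled it. Any later string read
 * stops within `size` bytes. Returns the resulting string length. */
size_t clamp_terminate(char *buf, size_t size)
{
    assert(buf != NULL && size > 0);
    buf[size - 1] = '\0';
    return strlen(buf);     /* always less than size */
}
```

In the Dasm listing this would be applied to opcode and operands right after each DisassembleNative() call and before the printf().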



rachy
Flower Power! @-->--

   Posted: 2006-Aug-11 09:40:13 · Edited by: rachy

@tboeckel:

Come on, be reasonable. It is not possible that these routines would ever overrun 50 chars, let alone 200. These routines are not common printf functions where you can never know how long the result might be; we can predict the maximum length of the output exactly.

@abalaban:

Ok, point taken.



Hypex
Member

   Posted: 2006-Aug-12 18:24:09

@rachy
      You are absolutely right, but on the other hand: it is hardly possible to overrun 200 chars in these functions. There is no opcode in PPC assembly longer than 8 chars (it would be lovely to have a 100 chars long opcode ;) nor will the operands ever be longer than 20 chars at most. So don't be afraid of running over the bounds of your string.



With my finished output, 8 characters seems perfectly fine for a PPC opcode to be neatly aligned in a listing. I think some PPC, wait, PowerPC opcode mnemonics would end up at almost 100 chars long when extended to be fully human readable/understandable. Sometimes I think of making a translation to a 68k-friendly lookalike version: mr to move.l, li to move.l #n, lwz to move.l (Rx),x. Actually, I think I could just about translate every instruction to a move. One load becomes a move, a store becomes another move, or a load and store together could be optimised and become another move. I've got it sorted. :-)

By comparison, the old 8-bit MOS CPU's (MPU's ?) used a very neat three letter opcode, but it was fully understandable. LDA. STA. STX. JSR. RTS. RTI. And the operands would appear neatly in brackets. When I learned about the 68000, I thought all the code didn't look neat at all. MOVE.L D0,32(A4). That was unsightly, MOVE.L D0,(32,A4) please, or even MOVE.L D0,($0020,A4) to be slightly purist.

      About the "0x" prefix: all the PPC assemblers (except StormPowerASM, which strayed really far from the PPC standards) use this prefix instead of the "$" sign. This is also what IBM/Motorola suggest in their documentation. It is just a matter of getting used to it.



Personally, I prefer to leave 0x to C programming and my $ to AmigaE, which accepts that as the hex prefix, and assembler. To me, it was one thing that separated them and took less space. I always thought it was silly to have that 0 in front; there was also &H in AmigaBASIC at the time. I also think of it as a "String" sign, not a dollar sign as it looks to everyone else around me. It's one among many strange things to get used to in PPC programming, among them the strange one about the first bit on the left (MSB) of a (long) word being called bit 0 instead of bit 31, and the other end, where 0 is now 31. Then that rotate instruction that somehow takes an MSB and LSB as a bitmask instead of a proper full 32-bit mask, but I guess it is useful. Then there is no normal JSR, RTS subroutine calling. And from looking at PPC code, no normalised compare then branch set. From what I see it's a compare, check something in some CR register, then branch with a minus sign, uh? Doesn't PPC have a test instruction, set the zero flag in an operation, or similar, to just use two instructions in succession?
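
The bit numbering and the rotate-with-mask behaviour mentioned above can be modelled in plain C. This is a sketch assuming 32-bit words and the IBM convention where bit 0 is the MSB; ibm_bit(), ppc_mask() and rlwinm() are illustrative names for this post, not compiler intrinsics or OS4 functions.

```c
#include <assert.h>
#include <stdint.h>

/* IBM numbering: bit 0 is the MSB and bit 31 the LSB of a 32-bit word. */
uint32_t ibm_bit(unsigned n)
{
    assert(n < 32);
    return UINT32_C(1) << (31 - n);
}

/* Mask with ones from IBM bit mb through IBM bit me. If mb > me the
 * run of ones wraps around the end of the word. */
uint32_t ppc_mask(unsigned mb, unsigned me)
{
    uint32_t from_mb, to_me;

    assert(mb < 32 && me < 32);
    from_mb = UINT32_C(0xFFFFFFFF) >> mb;          /* IBM bits mb..31 */
    to_me   = UINT32_C(0xFFFFFFFF) << (31 - me);   /* IBM bits 0..me  */
    return (mb <= me) ? (from_mb & to_me) : (from_mb | to_me);
}

/* Model of "rotate left word immediate then AND with mask". */
uint32_t rlwinm(uint32_t rs, unsigned sh, unsigned mb, unsigned me)
{
    uint32_t rotated;

    assert(sh < 32);
    rotated = (rs << sh) | (rs >> ((32 - sh) & 31));
    return rotated & ppc_mask(mb, me);
}
```

For example, rlwinm(0x12345678, 8, 24, 31) rotates the top byte down to the bottom and keeps only IBM bits 24 to 31, giving 0x12. The two mask numbers describe a contiguous run of bits, which is why the instruction can get away without a full 32-bit mask operand.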

It's all very strange to me, glad I'm just sticking to C at the moment for my new stuff, but would still be interested in what an x86 to PPC JIT routine would look like. Of course, still not knowing much about x86, and hating the shortcut MOVE to MOV or RET instead of RTS (what about RTN even - so much neater), that might not be a good idea.



rachy
Flower Power! @-->--

   Posted: 2006-Aug-13 19:12:39

@Hypex:

Have you ever heard of "BFEXTU", "NBCD" or "FETOXM1" opcodes? Those are for 68k... :)

And I wouldn't call the addressing modes of 68k easily understandable. Life begins with the extended addressing, such as:

MOVE.B ([disp,PC],D1.W*8,$10),([offs1,A1,D2.W*4],offs2)


Lovely, isn't it? ;)

The switched numbering of the bits is really annoying. (And also not exactly logical if we look at the directions of the shifting instructions.) I have special equipment for solving this issue: a checkered paper, which I can use for stepping in directions... :)



Hypex
Member

   Posted: 2006-Aug-14 15:19:07

@rachy
      Have you ever heard of "BFEXTU", "NBCD" or "FETOXM1" opcodes? Those are for 68k... :)



I think I have heard of BFEXTU and NBCD. Bit-Field Extend-Unsigned? Negate Binary Coded Decimal? NBCD, whatever its meaning, wouldn't be that hard to remember. Don't know about the last one, a floating point operation? These really sound like 68020+ op-codes, where things did get complicated.

      And I wouldn't call the addressing modes of 68k easily understandable. Life begins with the extended addressing, such as:
      MOVE.B ([disp,PC],D1.W*8,$10),([offs1,A1,D2.W*4],offs2)
      Lovely, isn't it? ;)



Yes, that must have been the point where they lost it; those * multipliers really put me off. IMHO things like that shouldn't appear. It implies a shift operation or index and, if so, should be specified as such.

Yes, it is lovely, but, what does it do? Add all the operands up in the source and destination to calculate the effective addresses?

      The switched numbering of the bits is really annoying. (And also not exactly logical if we look at the directions of the shifting instructions.) I have special equipment for solving this issue: a checkered paper, which I can use for stepping in directions... :)



Classic devices of measurement can always be relied upon.

Since it even makes it hard for you, I wonder why the bits were numbered like this, do they give any good reasons why?



rachy
Flower Power! @-->--

   Posted: 2006-Aug-15 10:01:47

@Hypex
      These really sound like 68020+ op-codes, where things did get complicated.



I would rather say: where 68k processors were shaping up at last.

      Yes, it is lovely, but, what does it do? Add all the operands up in the source and destination to calculate the effective addresses?



You must be kidding! The extended addressing modes help C compilers a lot with indirect addressing, not to mention how much faster they are than calculating such complex addresses via integer instructions. I used them in my programs many times; they are really handy.

Anyway, let's just conclude this topic: 68k assembly programming can be as complex as PPC assembly programming (if not even more so). :)
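
For the question of what those extended modes actually compute, here is a simplified model in C of the two 68020 memory-indirect forms used in the MOVE.B example above. It is a sketch under assumptions: memory is a flat big-endian byte array, the index value is passed in already sign-extended, and read_be32(), ea_preindexed() and ea_postindexed() are illustrative names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Read a big-endian long word from a simulated memory image. */
uint32_t read_be32(const uint8_t *mem, size_t size, uint32_t addr)
{
    assert(size >= 4 && addr <= size - 4);
    return ((uint32_t)mem[addr] << 24) | ((uint32_t)mem[addr + 1] << 16) |
           ((uint32_t)mem[addr + 2] << 8) | (uint32_t)mem[addr + 3];
}

/* ([bd,An,Xn*scale],od): the scaled index is added BEFORE the pointer
 * is fetched from memory, then the outer displacement is added. */
uint32_t ea_preindexed(const uint8_t *mem, size_t size, uint32_t an,
                       int32_t bd, int32_t xn, uint32_t scale, int32_t od)
{
    uint32_t pointer_addr = an + (uint32_t)bd + (uint32_t)xn * scale;
    return read_be32(mem, size, pointer_addr) + (uint32_t)od;
}

/* ([bd,An],Xn*scale,od): the pointer is fetched first, then the scaled
 * index and the outer displacement are added to it. */
uint32_t ea_postindexed(const uint8_t *mem, size_t size, uint32_t an,
                        int32_t bd, int32_t xn, uint32_t scale, int32_t od)
{
    uint32_t pointer = read_be32(mem, size, an + (uint32_t)bd);
    return pointer + (uint32_t)xn * scale + (uint32_t)od;
}
```

So it is more than adding the operands up: each form reads a pointer from memory in the middle of the calculation, which in C corresponds roughly to indexing through a table of pointers or a pointer stored inside a structure.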



abalaban
Member

   Posted: 2006-Aug-15 12:12:45

@rachy
      68k assembly programming can be as complex as PPC assembly programming (if not even more so). :)



Isn't that why 68k was CISC? ;-)



Hypex
Member

   Posted: 2006-Aug-15 16:29:27

@ rachy

How were 68k processors not shaping up? By comparison, x86 never really made it past 8 main registers and that hasn't exactly hampered its performance. These days it would probably be considered obsolete to maximise registers, as having lots of registers mattered more in the mid-eighties.

But hey, you could say that with the new Intel core-duo they finally have upgraded to 16 main registers! Yeah, and with everything else.

By the sounds of it, these C compilers must be optimised for 68k, or those special instructions. So, do C compilers really optimise for the particular CPU? Last I heard, things like GCC still didn't perfectly compile the fastest PPC assembly. There was a thread about this, but I forgot what it was about. The result was that a hand-coded effort produced a ten-fold speed increase or something like that.



salass00
Member

   Posted: 2006-Aug-15 16:48:07

@Hypex

This one?

PPC (assembler) endian conversion routines



rachy
Flower Power! @-->--

   Posted: 2006-Aug-15 20:58:14 · Edited by: rachy

@Hypex
      But hey, you could say that with the new Intel core-duo they finally have upgraded to 16 main registers! Yeah, and with everything else.



680x0 was getting into shape at 68020, Intel is getting into shape with AMD64... ;)

      By the sounds of it, these C compilers must be optimised for 68k, or those special instructions. So, do C compilers really optimise for the particular CPU?



It depends on many factors: how the original C code was written, which compiler was used, which version was used, which optimizations were turned on, which target processor was specified, which data model was specified. But yes, some of the C compilers applied extended addressing modes in certain cases.

      Last I heard, things like GCC still didn't perfectly compile the fastest PPC assembly



GCC is not the only C compiler, nor the best. For special cases you can write better code in assembly than what a compiler might produce. (But this has nothing to do with the previous topic about 68k and extended addressing.)



Hypex
Member

   Posted: 2006-Aug-16 17:10:07 · Edited by: Hypex

@ salass00

Yeah, that's it, endian conversion.