
Software nostalgia discussion

Yes, it depends on the era.

When I started, people were not even always coding in asm; some were coding straight in hex. They left spaces for functions to expand! Stupid? Yes.

The IBM PC era saw very little asm for applications. I coded various DOS apps (including a nice terminal emulator, which is still used at work on test rigs) in asm. But most people were coding in C/C++ (mostly Borland but there were many others) plus Basic, etc, etc.

And everything after the 1980s was in C/C++. In the mid 1980s, at my then company, I was doing a lot of asm, but the others all used IAR C, which was awful on speed; I used to hand-code the critical stuff in asm.

Today, asm is mostly gone. My current project, ARM32, uses asm in the startup code (mostly stuff like filling BSS with 0x00, which you could do in C, but I inherited that; I do use C in other places, like setting up the boot loader) and uses asm only where it is impossible to avoid: the FreeRTOS task switching code, and specific (minimum) timing delays.
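
For anyone curious, the C version of that BSS fill is only a few lines. This is an untested sketch, assuming the linker script exports __bss_start__ and __bss_end__ (the actual symbol names depend on the toolchain and linker script):

    /* Zero the BSS region from C, before main() and before any BSS
       variable is touched. Must use only locals, since BSS itself
       is not yet initialised at this point. */
    extern unsigned char __bss_start__;   /* provided by the linker script */
    extern unsigned char __bss_end__;

    static void zero_bss(void)
    {
        unsigned char *p = &__bss_start__;
        while (p < &__bss_end__)
            *p++ = 0x00;
    }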

Administrator
Shoreham EGKA, United Kingdom

Asm is unmaintainable by almost anyone coding today

For sure x86 assembler is. Back in the day when it was common to write assembler, the machine instructions were a lot simpler. I recently got hold of a PDP-11 assembler program I wrote in 1974, and it was crystal clear what it was doing. The 11 had instructions like MOV and ADD.

The x86 has instructions like VMADDMULRTQ7BXZ (OK, I made that one up). It’s anybody’s guess what they do. They are only meant to be produced by compilers.

I can even still read Elliott 903 assembler. That had the unique “feature” that there were no instruction names. You had to just know that 4 was load accumulator, 5 was store accumulator, etc. But since it only had 16 possible instructions, it wasn’t too hard.

LFMD, France

In that case you use __builtin_popcount().

Never knew this existed

the compiler can produce totally incomprehensible assembler that would be unmaintainable if it was source.

Asm is unmaintainable by almost anyone coding today

But if you were coding in asm you would not do what compilers do, which tends to be a pretty random (though usually efficient, after optimisation) mess.

I don’t think we disagree on anything. The reason I am counter-arguing here is that I’ve spent a whole load of time dealing with weird stuff which compilers and linkers do and which is badly documented. A little example: a function in a lib (.a) cannot override a weak function in your code; only a function in an object module (.o) can do such an override (there is a sketch of this below, after the size figures). That wasted a few days of my time, and it was only with the help of the EEVBLOG forum that I identified it. Compared with the time wasted on crap like that, I’ve found esoteric “optimised” code to be worthless, except for the code size saving. This is a little sample:

-O0 produces 491k (no optimisation)
-Og produces 342k (optimised for debuggability, just about usable for debugging)
-O1 produces 338k
-Os produces 305k (optimise for minimum size, at the expense of speed)

So, yeah, the basic optimisation levels are worth using.
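
The weak-function gotcha above, sketched out with made-up file and function names. With GNU ld (and most traditional linkers), archive members are only pulled in to satisfy undefined symbols, and the weak definition in your own object already satisfies the reference:

    /* app.c -- weak default, meant to be overridden */
    #include <stdio.h>

    __attribute__((weak)) void board_init(void)
    {
        puts("weak default board_init");
    }

    int main(void)
    {
        board_init();
        return 0;
    }

    /* board.c -- the "real" implementation */
    #include <stdio.h>

    void board_init(void)
    {
        puts("strong board_init");
    }

Link app.o and board.o directly and the strong version wins, as expected. Put board.o into libboard.a and link against that instead, and the weak default wins, because nothing is undefined, so the linker never extracts board.o from the archive.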

Administrator
Shoreham EGKA, United Kingdom

Sure, but I was talking about doing it in C.

In that case you use __builtin_popcount().
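
For anyone who hasn’t seen it, it is just this (GCC and Clang both have it; on a target with a population-count instruction and the right -march/-mcpu it compiles down to that single instruction, otherwise you get a small helper routine):

    #include <stdio.h>

    int main(void)
    {
        unsigned int x = 0xF00Fu;
        /* counts the set bits in x; 0xF00F has 8 of them */
        printf("%d\n", __builtin_popcount(x));
        return 0;
    }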

The compiler cannot be more clever than the compiler writer.

Whilst that’s obviously true, the compiler can produce totally incomprehensible assembler that would be unmaintainable if it was source. Some of the SIMD examples he gives in the video illustrate perfectly what I mean.

LFMD, France

In my salad days, I used self-modifying code for mutexes or spinlocks, don’t remember all the details.

I also did quite a bit of reverse engineering on utilities and device drivers, and found an exotic RAM-saving trick somewhere in DEC RT-11: a successful completion of a command produced a message that something (don’t remember what) was CORRECT; if the command failed, the code would patch two bytes in the same message, changing CORRECT to CORRUPT.

Last Edited by Ultranomad at 26 May 12:29
LKBU (near Prague), Czech Republic

Self-mod code is quite esoteric

I never actually did that, but a candidate for it would be the Z80 indexing instructions like

ld d, (ix+23)

and the 23 is stored as a byte in the instruction, so if the code is in RAM you could modify that byte. It doesn’t really save any time though.

Administrator
Shoreham EGKA, United Kingdom

Obviously there are other problems with asm too, like documentation (which coders hate doing, also for best job security, so asm is mostly unmaintainable).

I was at a fun retro event at the National Museum of Computing last weekend (an Econet LAN party; Econet was Acorn’s networking in the 1980s, a low-cost bus network based on the Motorola 68B54 ADLC), trying to reverse engineer some 6502 asm I wrote when I was 15. I think I spent most of the two days saying “This doesn’t make any sense at all!”, mostly because back in the day I wrote extremely bad 6502 asm (lots of self-modifying code for no good reason).

I did figure out what the code did in the end, but decided if I wanted to actually run the networking programs I’d written back then, rewriting them would probably be the best option!

Last Edited by alioth at 26 May 10:25
Andreas IOM

Hard to beat the POPCNT instruction (ARM has one too).

Sure, but I was talking about doing it in C.

The challenge of writing in C and expecting the compiler to recognise the code as e.g. counting 1s, and drop in POPCNT or whatever, is that it is

  • code style dependent
  • compiler version dependent

and if the code really is critical then a compiler upgrade could break it. In your firm you need to be very careful with compiler upgrades, and have a regression test suite.
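
To make the fragility concrete, here is an untested sketch: whether the hand-rolled loop below gets spotted and replaced with a single POPCNT/CNT depends on the compiler, its version and the flags, whereas the builtin states the intent directly.

    /* May or may not be recognised as a population count, depending on
       compiler, version and optimisation/target flags. */
    unsigned popcount_loop(unsigned x)
    {
        unsigned n = 0;
        while (x) {
            x &= x - 1;   /* clear the lowest set bit */
            n++;
        }
        return n;
    }

    /* States the intent explicitly; the compiler picks the best expansion. */
    unsigned popcount_builtin(unsigned x)
    {
        return (unsigned)__builtin_popcount(x);
    }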

The compiler produces code which is way better than anything you would do by hand.

That, however, does not withstand logical scrutiny. The compiler cannot be more clever than the compiler writer.

It’s a hollow argument though, because while asm written by somebody clever will (must) always outperform a compiler, you will never get anything finished these days. I was doing asm for about 30 years, and the only reason I got some (very good) stuff done is that in the old days products didn’t need to be so sophisticated. Today, you might spend 3 months coding the functionality and then 10 man-years coding the connectivity (Ethernet, TCP/IP, TLS, etc). So you need libs for the latter… another debate. Obviously there are other problems with asm too, like documentation (which coders hate doing, also for best job security, so asm is mostly unmaintainable).

Administrator
Shoreham EGKA, United Kingdom

And probably the fastest would be a lookup table for each byte, and add them all up.

Hard to beat the POPCNT instruction (ARM has one too).

Intel has instructions for the common crypto stuff too.

I’ve spent the last ten years working on super-high-performance network stuff: we do intensive packet processing at 10 Gbit/sec. It’s not trivial, and a huge amount of work has gone into performance (e.g. ensuring everything is in L1 cache when needed, and avoiding locks and atomic operations in the data path). There is exactly one place where we use a dozen lines of assembler. Everything else is in C++. The compiler produces code which is way better than anything you would do by hand.

LFMD, France

I thought that application used a lot of FPGAs, which are far faster than software could ever be. Is that the business where all the users in a building are fed via fibres cut to equal length, so that all the dealers get the price feed at the same time (within nanoseconds)? There was a guy on here who works/worked in that business, but I think he’s gone.

One challenge with cunning optimisations is that you have to write the C code such that the compiler recognises the pattern. The video presenter came across one such example in GCC: the weird optimisation where you concurrently test for four possible bytes by using a side effect of a 32 bit comparison. In GCC it was dependent on the order in which the four values were listed. That makes it sensitive to coding style. And if you are going to accept a coding style dependency as the price of getting the highest performance, why not just use good old assembler?
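
The general idea is something like this (an illustrative sketch with made-up delimiter bytes, not the exact example from the video). Some compiler versions will fuse the plain chained comparison into a single word-sized test on their own; the second function writes one such trick out by hand:

    #include <stdint.h>

    /* Plain version: some compilers fuse this into one word-sized test,
       and whether they do can depend on the order of the constants. */
    int is_delim_plain(unsigned char c)
    {
        return c == ' ' || c == ',' || c == '\r' || c == '\n';
    }

    /* Hand-written version: broadcast c into all four byte lanes, XOR with
       the packed target bytes, then apply the classic "does this word
       contain a zero byte" test. Fast, but good luck maintaining it. */
    int is_delim_swar(unsigned char c)
    {
        const uint32_t targets = (uint32_t)' '
                               | ((uint32_t)',' << 8)
                               | ((uint32_t)'\r' << 16)
                               | ((uint32_t)'\n' << 24);
        uint32_t x = (0x01010101u * c) ^ targets;  /* a lane is 0 iff c matches it */
        return ((x - 0x01010101u) & ~x & 0x80808080u) != 0;
    }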

Same with other stuff like counting all the “1” bits in a word. One could write that function in lots of different ways. And probably the fastest would be a lookup table for each byte, and add them all up.
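
A rough sketch of what I mean, with the table built once at startup (you could equally generate it at compile time):

    #include <stdint.h>

    static uint8_t popcnt8[256];   /* popcnt8[b] = number of set bits in byte b */

    static void init_popcnt8(void)
    {
        for (int i = 0; i < 256; i++) {
            int n = 0;
            for (int b = i; b; b >>= 1)
                n += b & 1;
            popcnt8[i] = (uint8_t)n;
        }
    }

    /* One lookup per byte, summed. */
    static unsigned popcount32_table(uint32_t x)
    {
        return popcnt8[x & 0xFF]
             + popcnt8[(x >> 8) & 0xFF]
             + popcnt8[(x >> 16) & 0xFF]
             + popcnt8[x >> 24];
    }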

Crypto is another area I have spent time on and in general it is addressed with lookup tables, which are sometimes huge (megabytes). All the “s-box” based ciphers benefit from tables.
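
The s-box part really is just a lookup per byte. An illustrative sketch, with placeholder table values (a real cipher defines all 256 entries):

    #include <stdint.h>
    #include <stddef.h>

    /* Placeholder values only; a real cipher specifies the full table. */
    static const uint8_t sbox[256] = { 0x63, 0x7c /* ...254 more entries... */ };

    /* The non-linear substitution step: one table lookup per byte. */
    static void sub_bytes(uint8_t *block, size_t len)
    {
        for (size_t i = 0; i < len; i++)
            block[i] = sbox[block[i]];
    }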

Another thing, relevant to management of projects where the product has a long market life but may need a revisit periodically, is that if your code relies on shaving off every last cycle to function, you need to archive not just the source but also the tools – probably in a VM. But almost nobody does that.

IMHO, effort should go into improving the absolutely horrible linker script syntax. I’ve wasted days on that stuff.

Administrator
Shoreham EGKA, United Kingdom