Code optimisations - updated 09/09/10

Y_Less
03/12/2008, 03:07 PM
Code optimisation

Contents

Introduction
String optimisation
State machines (automata)
Callback hooks
Interesting Macros
Testing
    foreach
Minor optimisations
    IEEE numbers
    Rearranging
    Data rearrangements
    Know the language
    Distance checks
    Speed order
    Know your values
    Return values
    Small snippets
        Equivalence
        Empty strings
        Copying strings
Assumptions
    Introduction
    Solutions
        Ignore it
        Make modifications easy
        Code defensively
    Important
Memory reduction
    All vehicles
    All attributes
    More than 32 values
    Excess dimensions
CPU vs Memory
Lists
    Types
    Mixed lists
    Code
Binary trees
    Example
    Balanced and unbalanced
    Modification
        Addition
        Deletion

Introduction

These are just some techniques for making your code faster that I have picked up along the way. Please note that I in no way pretend to be an authority on this subject; these are just what I know. Others may know other methods, in which case please do share them: even if no-one else cares, I would be interested in knowing them. Note also that many of these techniques apply to languages besides PAWN (the last one I wrote was accused of being stupid because PAWN does not have dynamic memory allocation (personally I think that's a good thing, but there we go)); however, a lot of other languages may incorporate bits of these into the compiler to do in-line optimisation.

This does not promise to make your code good, just hopefully better. Also note that some bits are fairly complex: this is aimed at being a more advanced tutorial, so it does make some assumptions about knowledge in some areas.

String optimisation

For those of you who haven't read my thread on better string usage, it's here:

Why you shouldn't make your strings 256 cells big (http://forum.sa-mp.com/showthread.php?t=55261)

State machines (automata)

For those of you who haven't read my thread on state machines (automata), it's here:

State machines (automata) (http://forum.sa-mp.com/showthread.php?t=86850)

Callback hooks

For those of you who haven't read my thread on callback hooking for libraries, it's here:

Simpler library writing and usage (http://forum.sa-mp.com/showthread.php?t=85907)

Interesting Macros

This topic gives some very interesting uses for macros to make complex code simpler:

Interesting macros (http://forum.sa-mp.com/showthread.php?t=103650)

Testing

Firstly I'm going to explain my testing procedure. If I have two pieces of code which do the same thing but in different ways and I want to know which is faster I clock them:


#define CODE_1 printf("%d", 42);
#define CODE_2 new str[4]; format(str, sizeof (str), "%d", 42); print(str);
#define ITERATIONS (10000)

Test()
{
new
t0,
t1,
t2,
i;
t0 = GetTickCount();
for (i = 0; i < ITERATIONS; i++)
{
CODE_1
}
t1 = GetTickCount();
for (i = 0; i < ITERATIONS; i++)
{
CODE_2
}
t2 = GetTickCount();
printf("Time 1: %04d, time 2: %04d", t1 - t0, t2 - t1);
}


Clearly both pieces of code will display the number "42" in the server console, but they do it in different ways. Hopefully no-one will need to run this code to know which method is faster, but it's a good simple example of testing which of two equivalent pieces of code is faster. The ITERATIONS loop is important: in all likelihood each piece of code takes less than a millisecond, so a single run of either would report its time as zero. Also, if you only run the code once, threading becomes a major issue: if one version is interrupted by the OS it can report itself as taking substantially longer when in fact it's faster. If both are run lots and lots of times then the interruptions will hopefully even out, and each loop will take more than one millisecond. The layout of the code is also important: all the variables are declared first to move their overhead outside the timed loops (their execution time is so small it likely won't affect the outcome anyway, but it's good for consistency).

It's also sometimes good to wrap the Test function in another loop, especially if the results are very close, to verify the results. Execution times can vary slightly due to threads; this is visible if you run multiple tests, as a variation of maybe a few milliseconds each time. If you run close results repeatedly then you can check that one is consistently faster rather than faster just the once, as that may have been a fluke and 90% of the time the other is faster, just not that one time.

Sometimes you may need more advanced test code, e.g. to test more than two equivalents; this is fairly easy to expand:


#define CODE_1 printf("%d", 42);
#define CODE_2 new str[4]; format(str, sizeof (str), "%d", 42); print(str);
#define CODE_3 print("42");
#define ITERATIONS (10000)

Test()
{
new
t0,
t1,
t2,
t3,
i;
t0 = GetTickCount();
for (i = 0; i < ITERATIONS; i++)
{
CODE_1
}
t1 = GetTickCount();
for (i = 0; i < ITERATIONS; i++)
{
CODE_2
}
t2 = GetTickCount();
for (i = 0; i < ITERATIONS; i++)
{
CODE_3
}
t3 = GetTickCount();
printf("Time 1: %04d, time 2: %04d, time 3: %04d", t1 - t0, t2 - t1, t3 - t2);
}


Etcetera...

foreach

I recently clocked my foreach function against the default for/IsPlayerConnected (IPC) code. foreach uses a linked list of players so when you loop it ONLY loops through connected players, compared to IPC code which loops through ALL players and checks if they're connected. I knew that on a large server with not many players it was faster, but I wasn't sure about more full servers, so I clocked it. I didn't have 200 players to test with but I know that IsPlayerConnected takes pretty much the same time to run whether it returns true or false (it basically just returns a variable in the server which could be either), so this wasn't a problem as for a given number of players IPC would run at a constant speed regardless of whether they were on or not. foreach runs very differently depending on how many players are connected, taking next to no time with no players and a lot longer for a full server, I just wanted to know if this longest execution time was longer than the constant time IPC ran at. I actually wanted to profile it at all player counts, this meant faking player connections in foreach, which wasn't hard as it's my code and you just call a connect function. The test code for this ended up looking like:


#define FAKE_MAX 200
#define SKIP 0
Iter_Create(P2, FAKE_MAX);

TestFunc()
{
new
fep = 0,
fet = 0,
fip = 0,
fit = 0,
i = 0;
while (i < SKIP)
{
Itter_Add(P2, i++);
}
while (i <= FAKE_MAX)
{
new
t0,
t1,
t2,
j;
t0 = GetTickCount();
for (j = 0; j < 10000; j++)
{
for (new playerid = 0; playerid < FAKE_MAX; playerid++)
{
if (IsPlayerConnected(playerid))
{
// Do something
}
}
}
t1 = GetTickCount();
for (j = 0; j < 10000; j++)
{
foreach(P2, playerid)
{
// Do something
}
}
t2 = GetTickCount();
printf("Players: %04d, for: %04d, foreach: %04d", i, t1 - t0, t2 - t1);
fit = fit + t1 - t0;
fet = fet + t2 - t1;
fip += FAKE_MAX;
fep += i;
if (i < FAKE_MAX)
{
Itter_Add(P2, i);
}
i++;
}
printf("for ms/p: %04d, foreach ms/p: %04d", (fit * 100) / fip, (fet * 100) / fep);
}


This ran the code 201 times, once for each player count (0-200) (with both foreach and IPC it doesn't matter WHICH players are connected in terms of speed). It also allowed me to fake more players, e.g. to test how this code would run on a 0.3 server with 500 players. IPC is actually slightly faster if the player you're testing doesn't exist, and it was STILL slower, so that was fairly conclusive. The one test I didn't run was:


t0 = GetTickCount();
for (j = 0; j < 10000; j++)
{
for (new playerid = 0; playerid < FAKE_MAX; playerid++)
{
Kick(playerid);
}
}
t1 = GetTickCount();
for (j = 0; j < 10000; j++)
{
foreach(P2, playerid)
{
Kick(playerid);
}
}
t2 = GetTickCount();


Kick, and in fact every player function, has an internal IsPlayerConnected check, so if you're only running one function in a loop it's more efficient to NOT call IsPlayerConnected and just call the function directly. If the player is connected you've saved a function call; if they're not connected you've not lost anything, as the only code that's been executed is the same as if you had called IsPlayerConnected. Unfortunately this example may affect the speed of foreach at high player counts, as you will be using the foreach code AND the IPC code in the same loop. If you did:


t0 = GetTickCount();
for (j = 0; j < 10000; j++)
{
for (new playerid = 0; playerid < FAKE_MAX; playerid++)
{
if (IsPlayerConnected(playerid))
{
Kick(playerid);
}
}
}
t1 = GetTickCount();
for (j = 0; j < 10000; j++)
{
foreach(P2, playerid)
{
Kick(playerid);
}
}
t2 = GetTickCount();


Then foreach would be faster, but that's not the most efficient way of doing it in the first instance.

I have actually looked into modifying a compiler so you can do something like:


eqiv
{
{
for (new playerid = 0; playerid < FAKE_MAX; playerid++)
{
if (IsPlayerConnected(playerid))
{
Kick(playerid);
}
}
}
{
foreach(P2, playerid)
{
Kick(playerid);
}
}
}


This would compile and optimise both versions of the code, then accurately clock both based on the generated code and known OpCode clock cycles to see which is faster. However there are all sorts of problems involved regarding test sets and I've not done it yet, so it's really a moot point (truthfully there are a number of improvements I'd like to make to various language compilers, I've just not done so yet).
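
For anyone wondering how a loop can visit ONLY connected players, here is a minimal sketch of the linked-list idea described at the start of this section. It is written in plain C rather than PAWN so it can be compiled on its own, it is NOT the real foreach source, and every name in it (MAX_SLOTS, next_slot, list_add and so on) is made up for the example:

```c
#define MAX_SLOTS 8          /* hypothetical, small for the example */
#define LIST_END  MAX_SLOTS  /* index of the extra "head/end" slot */

/* next_slot[i] is the next connected ID after i; the extra last slot
   acts as the list head. IDs that were never added are never visited. */
static int next_slot[MAX_SLOTS + 1];

static void list_init(void)
{
    /* Empty list: the head points straight back at itself. */
    next_slot[LIST_END] = LIST_END;
}

/* Insert id in ascending order. This cost is paid once per connect,
   not once per loop over the players. */
static void list_add(int id)
{
    int prev = LIST_END;
    while (next_slot[prev] != LIST_END && next_slot[prev] < id)
        prev = next_slot[prev];
    if (next_slot[prev] == id)
        return; /* already in the list */
    next_slot[id] = next_slot[prev];
    next_slot[prev] = id;
}

/* Walk the list; the number of steps equals the number of connected
   IDs, however large MAX_SLOTS is. */
static int list_count(void)
{
    int steps = 0;
    for (int id = next_slot[LIST_END]; id != LIST_END; id = next_slot[id])
        steps++;
    return steps;
}
```

After list_init(), list_add(5) and list_add(2), walking the list visits 2 then 5 and nothing else, which is why the loop time depends on how many players are connected rather than on the size of the server.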

Minor optimisations

IEEE numbers

There is a float representation for positive and negative infinity, and one for invalid numbers. I've seen people try all sorts of representations for large or invalid float numbers, take these examples:


if (!IsPlayerConnected(playerid))
{
return -1.0;
}
return GetDistance(playerid);



new
bool:first = true,
Float:distance = 0.0;
foreach (Player, playerid)
{
if (first)
{
first = false;
distance = GetDistance(playerid);
}
else
{
new
Float:temp = GetDistance(playerid);
if (temp < distance)
{
distance = temp;
}
}
}


The first code could do anything, with a distance of -1.0 meaning the player isn't connected, so returning an invalid distance. The second code finds the closest player to something (exact details aren't important). The second piece of code can be optimised by choosing a very large start number instead of the "first" variable:


new
Float:distance = 100000.0;
foreach (Player, playerid)
{
new
Float:temp = GetDistance(playerid);
if (temp < distance)
{
distance = temp;
}
}


But that misses the rare case when a player is over 100000 units away (note that in actual fact you would use squared values here, as detailed below, but this is for example only). Both of these examples have well defined solutions:


#define FLOAT_INFINITY (Float:0x7F800000)
#define FLOAT_NEG_INFINITY (Float:0xFF800000)
#define FLOAT_NAN (Float:0xFFFFFFFF)


This would make the code above:


new
Float:distance = FLOAT_INFINITY;
foreach (Player, playerid)
{
new
Float:temp = GetDistance(playerid);
if (temp < distance)
{
distance = temp;
}
}



if (!IsPlayerConnected(playerid))
{
return FLOAT_NAN;
}
return GetDistance(playerid);


"NaN" is a very special number - comparing it to any other number (including itself) will return false. To check for NaN you would do:


stock IsNaN(Float:number)
{
    // The Float: tag is needed so that the float comparison operators are used
    return !(number <= 0.0 || number > 0.0);
}


That function returns false if the number passed is less than, equal to, or greater than 0. All numbers, including + and - infinity, match those criteria; NaN DOESN'T because it's not a number, so doesn't have a value. Using these values guarantees (they're defined in the IEEE floating point number spec) that you will never use numbers which could be confused with real values. Due to the unique properties of NaN this very odd looking code should also work:


stock IsNaN(Float:number)
{
    // Only NaN is not equal to itself
    return (number != number);
}


DO NOT TRY:

if (number == FLOAT_NAN)

That code will fail because, as previously mentioned, NaN is not equal even to itself.
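
The same IEEE rules hold in any language that uses IEEE floats, so they are easy to try outside the server. Here is a small sketch in plain C (not PAWN), using the standard INFINITY and NAN macros from math.h instead of the bit patterns defined above; the function names are made up for the example:

```c
#include <math.h>
#include <stddef.h>

/* NaN is the only value that is not equal to itself. */
static int is_nan(float x)
{
    return x != x;
}

/* Smallest element of dist, or +infinity for an empty array, so no
   "first" flag and no arbitrary large start value is needed. */
static float min_distance(const float *dist, size_t count)
{
    float best = INFINITY;
    for (size_t i = 0; i < count; i++)
        if (dist[i] < best)
            best = dist[i];
    return best;
}
```

Any real distance, however large, is less than INFINITY, so the 100000-unit corner case goes away, and a caller can tell "nobody found" apart from every genuine result.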

Let's look at this when you want the distance to see if they're in range of something:


if (DistanceWithConnectionCheck(playerid) < 100.0)
{
// They're in range and connected
}


Returning infinity will mean the check fails, as will returning NaN (it's not less than, greater than or equal to 100) - compare this to the extra code you need here to check for "-1.0":


new
Float:distance = DistanceWithConnectionCheck(playerid);
if (distance == -1.0)
{
// They're not connected
}
else if (distance < 100.0)
{
// They're in range and connected
// -1.0 is less than 100, so we need to check for that specially
}


See the YSI object streamer for a real usage.

Rearranging

The compiler can do constant maths, this means if you do:


printf("%d", 4 + 5);


The compiler will do:


printf("%d", 9);


It won't bother putting in code to do the maths as there's no point - it'll always be the same result. Often a simple formula rearrangement can help your code:


new
var = (4 + somevar) - 11;


That will compile to do two bits of maths, first to add 4 to a number, then to subtract 11 from the result (some compilers may actually be able to optimise this in the way I'm about to describe, but it's a simple example). If you rearrange this sum you get:


new
var = somevar + (4 - 11);


The compiler can very quickly optimise this to:


new
var = somevar - 7;


A more complex example with no compiler optimisations:


new
gLastTime[MAX_PLAYERS];

#define EXPIRY 1000



new
time = GetTickCount();
foreach (Player, playerid)
{
if (time - gLastTime[playerid] > EXPIRY)
{
SendClientMessage(playerid, 0xFF0000AA, "Your time expired");
}
}


That's a basic example which detects when a certain time has passed since a player last did something. No options for optimisation there you may think, and you may be right, but it's always better to try. The equation here is:

time - gLastTime[playerid] > EXPIRY

=, ==, >= etc can all be rearranged in the same way, so the above equation is the same as:

time > EXPIRY + gLastTime[playerid]

More importantly, it is also the same as:

time - EXPIRY > gLastTime[playerid]

Why is this important? In terms of the loop "time - EXPIRY" is now a constant, as neither value changes in the loop. This means you can do:


new
time = GetTickCount() - EXPIRY;
foreach (Player, playerid)
{
if (time > gLastTime[playerid])
{
SendClientMessage(playerid, 0xFF0000AA, "Your time expired");
}
}


You have just cut out up to 200 repeated subtractions with basically no effort. The more constant, or pseudo-constant, elements you can get in a sum the better, especially when you're doing lots of them. If you were only checking one player I'd be tempted to put everything, including the GetTickCount() call, in the if statement, but not if you're doing it multiple times.

Data rearrangements

Another thing to consider is how your data is laid out and how you want to access it. Take the following data for example:


#define MAX_OWNED_VEHICLES 10

new gVehicleOwner[MAX_OWNED_VEHICLES] = {0, 2, 4, 6, 8, 10, 12, 14, 16, 18};


Here you have 10 vehicles, each with a player who owns it (assume for this example that a player can own at most one vehicle). If you want to find out who owns a vehicle you can simply do:


printf("The owner of vehicle %d is %d", vehicleid, gVehicleOwner[vehicleid]);


But what if you want to find out which vehicle a player owns? For that you would need to do something like:


new i = 0;
while (i < MAX_OWNED_VEHICLES)
{
    if (gVehicleOwner[i] == playerid)
    {
        printf("Player %d owns vehicle %d", playerid, i);
        break;
    }
    i++;
}
// The loop only runs to the end if no vehicle matched
if (i == MAX_OWNED_VEHICLES)
    printf("Player %d does not own a vehicle", playerid);


Now let's look at it a different way round:


#define MAX_PLAYERS 20

new gPlayerVehicle[MAX_PLAYERS] = {0, -1, 1, -1, 2, -1, 3, -1, 4, -1, 5, -1, 6, -1, 7, -1, 8, -1, 9, -1};


Now if you want to find out which vehicle a player owns it's just a simple array lookup, but if you want to see who owns a vehicle it's a loop. The question you need to consider here is which do you want to know more often? If you don't care who owns a given vehicle but do care which vehicle someone owns then use the second layout, and vice-versa for the first. If you use both a lot you may actually want to consider mirroring the data in two different arrays. This is a trade-off between speed and memory, and is a VERY common trade-off you find people dealing with. Personally I think speed is more important than memory on modern 32 and 64 bit machines, so I would use two arrays; you may disagree, reasoning that with processors running at multiple gigahertz a loop is cheap enough, and so use one array and a loop.
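
If you do mirror the data, the one thing you must get right is keeping the two arrays in step. A minimal sketch of that, in plain C rather than PAWN so it compiles on its own; the names (vehicle_owner, player_vehicle, set_owner, NONE) are hypothetical and the sizes are just the ones from the example:

```c
#define MAX_PLAYERS  20   /* sizes taken from the example above */
#define MAX_VEHICLES 10
#define NONE         (-1) /* hypothetical "no owner / no vehicle" marker */

static int vehicle_owner[MAX_VEHICLES];  /* vehicle -> player */
static int player_vehicle[MAX_PLAYERS];  /* player  -> vehicle */

static void ownership_init(void)
{
    for (int v = 0; v < MAX_VEHICLES; v++) vehicle_owner[v] = NONE;
    for (int p = 0; p < MAX_PLAYERS; p++) player_vehicle[p] = NONE;
}

/* Give vehicle v to player p. ALL writes go through here, so the two
   arrays can never disagree with each other. */
static void set_owner(int v, int p)
{
    int old_owner = vehicle_owner[v];
    int old_vehicle = player_vehicle[p];
    if (old_owner != NONE) player_vehicle[old_owner] = NONE;
    if (old_vehicle != NONE) vehicle_owner[old_vehicle] = NONE;
    vehicle_owner[v] = p;
    player_vehicle[p] = v;
}
```

Both questions ("who owns vehicle v?" and "what does player p own?") are now a single array read; the price is the second array plus a few extra writes whenever ownership changes.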

Let's look at a far more clean-cut example that was actually in a topic I read recently. This example is VERY cut down, I'm only using 10 models here:


new
gCars[] = {400, 403, 404, 406, 408, 409},
gHeavyVehicles[] = {400, 402, 408},
gBoats[] = {401, 407},
gFireEngines[] = {402, 405};


If you want to know if a model is a car you need to loop through "gCars" till you find that model or you reach the end. On the other hand with this code it's very easy to tell what model the car at a given position is, but this means nothing as it is data you are unlikely to ever want to know. So the question is: why is it easy to get data you don't want and hard to get data you do want? That makes no sense at all... We know that for a given model we want to know if it's a car, so we need to change the code to use the model as the array index (offsetting by 400), same as we did above to find what vehicle a player owns:


new
gIsACar[] = {1, 0, 0, 1, 1, 0, 1, 0, 1, 1},
gIsAHeavyVehicle[] = {1, 0, 1, 0, 0, 0, 0, 0, 1, 0},
gIsABoat[] = {0, 1, 0, 0, 0, 0, 0, 1, 0, 0},
gIsAFireEngine[] = {0, 0, 1, 0, 0, 1, 0, 0, 0, 0};


Now if you want to find out if a model is a car or not you simply do:

if (gIsACar[model - 400])

Isn't that SO much simpler and faster than a loop?

Using an entire cell to store a boolean value (1 or 0) is also very inefficient, but we'll cover that later.

Know the language

And I mean WELL. As someone reminded me the other day (I had commented on it on IRC), I found out a little while ago that:


if (2 <= a <= 4)


Works in PAWN (it doesn't in C), I had thought it was like in C, so had been doing:


if (2 <= a && a <= 4)


Not a vast improvement, but most of these aren't, it's the combined and repeated effect that's important.

If, for example, you didn't know about the "&" operator and you wanted to test if the second bit of a number was set you would need to do something like:


if ((a << 30) >>> 31)


or:


if ((a % 4) >>> 1)


Both those pieces of code isolate a single bit of the number, and would do what you wanted, but it's far better to know about "&":


if (a & 2)


It's clearly faster (shifts aren't too bad, but MOD is VERY slow, out of the other two versions always go for the first if you have to go for one of them), it's also a lot more obvious what you're trying to do. If you don't know what it does - go read pawn-lang.pdf.
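
If you want to convince yourself the three tests really agree, here is a quick sketch in plain C (not PAWN). It uses an unsigned 32-bit type so that C's ">>" behaves like PAWN's ">>>" (a logical shift); the function names are made up for the example:

```c
#include <stdint.h>

/* Three ways to ask "is bit 1 (value 2) set?" on a 32-bit cell. */

/* Shift the bit to the top, then back down to the bottom. */
static int bit1_shift(uint32_t a) { return (int)((a << 30) >> 31); }

/* Keep the low two bits with a (slow) modulo, then drop the lowest. */
static int bit1_mod(uint32_t a)   { return (int)((a % 4) >> 1); }

/* Mask the bit directly. */
static int bit1_and(uint32_t a)   { return (a & 2) != 0; }
```

All three return 1 for values such as 2, 3, 6 and 7 and 0 for values such as 0, 1, 4 and 5, but only the last one says what it means.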

This is clearly a very basic example but there are many many other examples. People tend to baulk when I tell them to read pawn-lang.pdf then wonder why I seem to know more about PAWN than they do, 2 + 2 = ... There's a reason I have it bookmarked, and it's not so I can quickly copy the link to post for people (although it is handy for that too).

As my example at the start of this section shows, there's always something new for you to learn, no matter how much you may already know. I can think of two huge parts of PAWN that I don't know at all and the fact that I don't know of any other areas just means I don't know them yet, it doesn't mean they don't exist (it's hard to know what you don't know).

Distance checks

0.3 now supports distance checks automatically, just use:

IsPlayerInRangeOfPoint(playerid, Float:range, Float:x, Float:y, Float:z);

This is a very common thing for people to do, and a very common thing for people to do wrong. Example:


if (PlayerDistanceToPoint(playerid, 10.0, 20.0, 2.0) <= 5.0)
{
// They're near 10.0, 20.0, 2.0 - do something
}


That code will work, and will trigger when the player is within 5.0 units of a point, but getting the distance to a point is very slow. The equation to get the distance between two points (x1, y1, z1 and x2, y2, z2) is:

(((x1 - x2) ^ 2) + ((y1 - y2) ^ 2) + ((z1 - z2) ^ 2)) ^ 0.5

"^" in this case means power, not XOR, "^ 0.5" is square root (trust me - try it on a calculator). The most common implementation of this is:

floatsqroot(floatadd(floatadd(floatpower(floatsub(x1, x2), 2), floatpower(floatsub(y1, y2), 2)), floatpower(floatsub(z1, z2), 2)));

Don't ask me why whoever originally wrote it didn't bother using standard operators, I have no idea, but simplified this code is:

floatsqroot(floatpower(x1 - x2, 2) + floatpower(y1 - y2, 2) + floatpower(z1 - z2, 2));

Now, the first thing to note is that the code to raise something to a generic power is complicated: it doesn't optimise for simple powers like 2, it just uses the basic algorithm. We know that something to the power 2 is just that thing multiplied by itself (or you should do): 3^2 is the same as 3*3, 57^2 is the same as 57*57 etc. Multiplication is much simpler than power, so the code becomes:

floatsqroot(((x1 - x2) * (x1 - x2)) + ((y1 - y2) * (y1 - y2)) + ((z1 - z2) * (z1 - z2)));

You could remove some brackets, but there's no point: this is just as fast as the reduced bracket version and more explicit. There is now more code here, but it's much faster. Now the quest for optimisation gets more interesting. We do each of the subtractions twice. You could store these in variables to only do them once, but then you've got additional variable writes which may cancel out the gain in speed:


x1 -= x2;
y1 -= y2;
z1 -= z2;
floatsqroot((x1 * x1) + (y1 * y1) + (z1 * z1));


We now have our nice efficient code for getting someone's distance from something, but there's still one major problem - all the optimisations done so far are nothing compared to floatsqroot, that is an insanely inefficient function (well, it's not inefficient, it's actually very efficient, but that doesn't make it fast because it's so complicated). Believe it or not most of the time you don't actually need to know exactly how far from the point someone is, just whether they're near the point or not. Now you should have read the part on rearranging (if not, go read it again), so let's apply that here:

((x * x) + (y * y) + (z * z)) ^ 0.5 <= 5.0

You can rearrange inequalities in exactly the same way as regular equations:


((x * x) + (y * y) + (z * z)) ^ 0.5 <= 5.0
(x * x) + (y * y) + (z * z) <= 5.0 ^ 2
(x * x) + (y * y) + (z * z) <= 5.0 * 5.0


Anyone familiar with maths should be able to vouch for that very simple rearrangement. We know how to quickly square something (as I just told you), and we know that square-rooting is very slow, so that's a vast improvement:


if (PlayerDistanceToPointSquared(playerid, 10.0, 20.0, 2.0) <= 5.0 * 5.0)
{
// They're near 10.0, 20.0, 2.0 - do something
}


Or:


if (IsPlayerInRangeOfPoint(playerid, 5.0, 10.0, 20.0, 2.0))
{
// They're near 10.0, 20.0, 2.0 - do something
}



stock IsPlayerInRangeOfPoint(playerid, Float:range, Float:x, Float:y, Float:z)
{
new
Float:px,
Float:py,
Float:pz;
GetPlayerPos(playerid, px, py, pz);
px -= x;
py -= y;
pz -= z;
return ((px * px) + (py * py) + (pz * pz)) < (range * range);
}


You can write your own code of course, but for reasons I can't go into I STRONGLY suggest you use that function name and parameter order.

Update:

I've found the original timing analysis I did on this area and the results are not what some people would expect. I compared a whole load of different distance analysis functions, including less accurate ones people use for speed, for example:


Type1(Float:x1, Float:y1, Float:z1, Float:x2, Float:y2, Float:z2, Float:dist)
{
x1 = (x1 > x2) ? x1 - x2 : x2 - x1;
if (x1 > dist) return false;
y1 = (y1 > y2) ? y1 - y2 : y2 - y1;
if (y1 > dist) return false;
z1 = (z1 > z2) ? z1 - z2 : z2 - z1;
if (z1 > dist) return false;
return true;
}


The results were that even these "faster" implementations were slower than the implementation outlined above, AND less accurate.

The results I got were:

1703 1781 1594 1641 2265 1782 2281 1891

1703 is the time for the "faster" version, 1594 was the time for my version. Conclusion - don't try to optimise distance checks by making them worse - the originals are both faster and more accurate.
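
To see why the per-axis version is less accurate, compare the two checks on a point near a corner. This is a sketch in plain C (not PAWN, and not the linked analysis code); in_box follows the idea of Type1 above, in_sphere follows the squared-distance idea, and both names are made up for the example:

```c
/* Per-axis "box" test, in the style of Type1 above. */
static int in_box(float dx, float dy, float dz, float dist)
{
    if (dx < 0) dx = -dx;
    if (dy < 0) dy = -dy;
    if (dz < 0) dz = -dz;
    return dx <= dist && dy <= dist && dz <= dist;
}

/* Squared-distance "sphere" test: no square root needed. */
static int in_sphere(float dx, float dy, float dz, float dist)
{
    return dx * dx + dy * dy + dz * dz <= dist * dist;
}
```

With offsets (4, 4, 4) and a range of 5 the box test says "in range", but the squared distance is 48, which is more than 25, so the point is really almost 7 units away. The two only agree when the point lies close to one of the axes.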

Note that running the linked code will produce 9 values due to a bug - just ignore the last value (a fatal mistake I made).

My analysis code can be found here (http://y-less.pastebin.ca/1299061).

Speed order

Different language features take different times to execute, in general the order is (from fastest to slowest):


Nothing
Constants
Variables
Arrays
Native functions
Custom functions
Remote functions


So for example:


for (new i = 0; i < MAX_PLAYERS; i++)


Is faster than:


for (new i = 0, j = GetMaxPlayers(); i < j; i++)


As the main part of the loop in the first uses a constant, whereas the main part in the second uses a variable (the overhead of a single function call in a loop is negligible compared to the repeated check). This second version is itself faster than:


for (new i = 0; i < GetMaxPlayers(); i++)


As this third version uses a repeated function call rather than a variable or constant.

I'm not sure where control structures fit in to the list, for example I'm not sure which of these is faster:


new var = a ? 0 : 1;
printf("%d", var);
printf("%d", var);
printf("%d", var);
printf("%d", var);
printf("%d", var);



printf("%d", a ? 0 : 1);
printf("%d", a ? 0 : 1);
printf("%d", a ? 0 : 1);
printf("%d", a ? 0 : 1);
printf("%d", a ? 0 : 1);


I suspect the first, unless you only have one print, in which case definitely the second, but again there are far more complex examples where it's less clear. This requires clocking but I've not done it yet (and there are loads of control structures, so I just apply my general rule (see below) to these).

So why is "nothing" in the list? Consider the following two bits of code:


new var = random(10);
printf("%d", var);



printf("%d", random(10));


Clearly the second is faster as there's no intermediary step. In actual fact most compilers will optimise out the variable but not all do.

Where the line starts getting blurry is with repeated calls. For example, which of these is faster:


new var = gArr[10];
printf("%d %d", var, var);



printf("%d %d", gArr[10], gArr[10]);


The first one has one array access, one variable write and two variable reads, the second has two array accesses. Truthfully I'm not sure which is faster but I have a general rule:

More than one function call: save the result in a variable. More than two array reads: save the value in a variable. So for the above code I would probably use the second version; however, a three element print would require three array accesses, which is more than two, thus I would use an intermediary variable:


new var = gArr[10];
printf("%d %d %d", var, var, var);


And I never call the same function more than once when I don't have to (especially as this can have unintended results if the function has changing returns or side effects).

Know your values

Another common bit of code (I'm sure most of you will recognise it) is this:


if (killerid == INVALID_PLAYER_ID)
SendDeathMessage(INVALID_PLAYER_ID, playerid, reason);
else
SendDeathMessage(killerid, playerid, reason);


Let's look at another example of similar code to try to make it more obvious what exactly this is doing and why it's stupid:


if (var == 1)
printf("%d", 1);
else
printf("%d", var);


If var is one this prints '1', if var isn't one this prints the value of var, either way the value of var gets printed, so what does the if do? This can just as easily be written:

printf("%d", var);

For the same reason, this does the same as the code above:

SendDeathMessage(killerid, playerid, reason);

The original code comes from the 0.2 version of LVDM, but in there other things were done too, making the check not pointless; people took this, removed the other bits and didn't think about what the code was actually now doing. It doesn't matter if killerid isn't valid because, as shown above, INVALID_PLAYER_ID is a perfectly acceptable input to SendDeathMessage.

Return values

Contrary to what the wiki says, many functions' return values are important as they indicate whether a (native) function succeeded or not. This can be utilised as we know from before that variables are faster than function calls, and doing things once is faster than doing things twice. An example:


new Float:health;
for (new i = 0; i < MAX_PLAYERS; i++)
    if (IsPlayerConnected(i))
    {
        GetPlayerHealth(i, health);
        SetPlayerHealth(i, health + 10.0);
    }


Pretty self explanatory and typical code, but if we understand error returns this can be optimised. Due to bugs in 0.1 all player and vehicle functions now have checks to make sure the thing you’re operating on actually exists, so if you did GetPlayerHealth on a player that didn’t exist the function would fail and the health variable would have the same value as before. More importantly pretty much all functions without an important return value return 0 on failure and 1 on success. Internally the player connection check in GetPlayerHealth is more or less identical to the one in IsPlayerConnected, so we’re checking if a player is connected twice. If they’re not connected GetPlayerHealth will end instantly, so instead of the above code we can do:


new Float:health;
for (new i = 0; i < MAX_PLAYERS; i++)
    if (GetPlayerHealth(i, health))
        SetPlayerHealth(i, health + 10.0);


This is no slower for players not connected and faster for players who are, so worst case you get no improvement, best case loads.

This can also be applied to other functions, even if they have return values:


if (IsPlayerInAnyVehicle(playerid))
{
new vehicleid = GetPlayerVehicleID(playerid);
SetVehiclePos(vehicleid, 0.0, 0.0, 10.0);
}


Again, an example of very common code, but again we need to ask what do these functions actually return? GetPlayerVehicleID returns the ID of the vehicle the player is in, and if they’re not in a vehicle it returns 0 (as this is an invalid vehicle ID). So, if we’re going to get the ID of the vehicle they’re in and this knows if they’re in a vehicle or not and can tell you, why check if they’re in one?


new vehicleid = GetPlayerVehicleID(playerid);
if (vehicleid)
SetVehiclePos(vehicleid, 0.0, 0.0, 10.0);


Now, instead of two function calls you only have one and you check the return of that one is valid (i.e. not 0).

One other aspect of return values is how long they exist for. If you set a variable in an if statement (which, if you’re not careful, will give an unintended assignment warning) the value you just set is still in the if statement, so if you did:


new
a = 1,
b = 0;
if ((b = a))


(Note the double brackets to avoid the unintended assignment warning)

Then "a" would be assigned to "b", so "b" would be 1, and that 1 would still be active in the if effectively as the return of the assignment, so this if is true, however:


new
a = 0,
b = 1;
if ((b = a))


After this code "b" will be 0, as "a" has successfully been assigned to "b", but as the result of this assignment is 0, the if fails. Just to illustrate this, figure this code out (re-read the part on strings if you need to):


stock strcpy(dest[], src[])
{
new i = 0;
while ((dest[i] = src[i])) i++;
}


This also means you can do:


new vehicleid;
for (new i = 0; i < MAX_PLAYERS; i++)
    if ((vehicleid = GetPlayerVehicleID(i)))
        SetVehiclePos(vehicleid, 0.0, 0.0, 10.0);


Example of a player name check implementation in PAWN using this:


NameCheck(name[])
{
new
i,
ch;
while ((ch = name[i++]) && ((ch == ']') || (ch == '[') || (ch == '_') || ('0' <= ch <= '9') || ((ch |= 0x20) && ('a' <= ch <= 'z')))) {}
return !ch;
}


This function will return true if the name is OK (i.e. all 0-9, a-z, A-Z, [, ] or _) and false if not. The (ch |= 0x20) is another little trick to convert a character to lower case, whatever its previous case, based on the fact that in ASCII upper and lower case characters are exactly 0x20 apart (A = 0x41, a = 0x61).
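
If the one-line loop is hard to follow, here is the same check unrolled. This is a sketch in plain C (not PAWN), so the chained comparisons are written out in full because C does not support them; name_ok is a made-up name:

```c
/* Returns 1 if every character is 0-9, a-z, A-Z, '[', ']' or '_'. */
static int name_ok(const char *name)
{
    int ch;
    /* The assignment's value is the character just read, so the loop
       stops by itself at the terminating zero. */
    while ((ch = (unsigned char)*name++) != 0)
    {
        if (ch == '[' || ch == ']' || ch == '_') continue;
        if ('0' <= ch && ch <= '9') continue;
        ch |= 0x20; /* fold ASCII upper case onto lower case */
        if ('a' <= ch && ch <= 'z') continue;
        return 0;   /* first character that is not allowed */
    }
    return 1;       /* reached the end without a bad character */
}
```

Note that the 0x20 fold is only applied AFTER the symbol and digit checks, the same order as in the PAWN version, because OR-ing 0x20 into other characters changes them too ('@' becomes '`', for example).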

Small snippets

Equivalence

Multiple things can mean the same thing, for example:

if (string[1] == 65)

Is the same as:

if (string[1] == 0x41)

Is the same as:

if (string[1] == 0b01000001)

Is the same as:

if (string[1] == 'A')

Although they all mean the same thing, and so none is any faster than the others, the last one makes it most obvious what you're trying to do (check if a character is 'A'). However in other circumstances one of the other three may be more appropriate, i.e. you want to see if something is 65, and the fact that that's the same as 'A' is merely coincidence.

This can be taken a bit further with zero:

if (string[1] == 0)

Is the same as:

if (string[1] == 0x00)

Is the same as:

if (string[1] == ((27 + 3) / 5) - 6)

Is the same as:

if (string[1] == 0b00000000)

Is the same as:

if (string[1] == '\0')

Is the same as:

if (string[1] == false)

Is (but only for a value that is never anything other than 0 or 1) the same as:

if (string[1] != true)

Is the same as:

if (string[1] == 0.0)

Is the same as:

if (!string[1])

In fact that can be taken a lot further (floatround_round, radian, SPECIAL_ACTION_NONE, seek_start, io_read and PLAYER_STATE_NONE are all also 0).

Empty strings

The standard way of checking if a string is empty is:


if (strlen(string) == 0)
{
// The string is empty because its length is 0
}


This clearly does check if a string is empty, and is fast if the string is empty, but is slower if the string isn't empty. The way strlen works is by looping through the string until it finds the end (which, as you should know from the other topic on strings, is denoted by a NULL character), once the end is hit the length is returned, so if the string isn't empty it will take a while to find the end.

As we know that the end of a string is denoted by a NULL character all we need to do to see if a string is empty is to see if the first character is the end of the string or not:


if (string[0] == '\0')
{
// The string is empty because its first character is the end
}


This can be further improved:


if (!string[0])
{
// The string is empty because its first character doesn't exist
}


The only place this doesn't apply is in strings passed by CallRemoteFunction and CallLocalFunction. Due to the way the PAWN VM works these strings MUST NOT have 0 length, so empty strings are passed as "\1\0" (i.e. character 1 (SOH), character 0 (NULL)). To check for this do:


if (string[0] == '\1' && string[1] == '\0')
{
// The string is empty because it's specially marked as empty
}


Or, using isnull from YSI:


#define isnull(%1) \
((!(%1[0])) || (((%1[0]) == '\1') && (!(%1[1]))))



if (isnull(string))
{
// The string is empty because isnull said so
}


Credit to Simon for the suggestion. (http://forum.sa-mp.com/index.php?topic=79810.msg525891#msg525891)

Copying strings

A lot of people tend to copy strings like this:

format(dest, sizeof (dest), "%s", src);

This is one of the worst ways to do it! I did timings on six different methods of copying strings, in all cases "b" is the destination and "a" is the source. "strcpy" is a hand written PAWN function to copy strings:


strmid(b, a, 0, strlen(a), sizeof (b));

format(b, sizeof (b), "%s", a);

b[0] = '\0';
strcat(b, a, sizeof (b));

memcpy(b, a, 0, strlen(a) * 4 + 4, sizeof (b)); // Length in bytes, not cells.

strcpy(b, a);

b = a;


Note that "b = a;" is the standard PAWN array copy and only works for arrays known at compile time to be the same size, or with a larger destination. Unfortunately I ran a range of tests and they do not point to a single best function. What they DO show quite clearly is that both the hand coded PAWN version and format are very slow at copying strings:

For short strings in small arrays, "b = a;" is fastest when applicable, strcat with prior NULL termination (important) is second.

For short strings in large arrays, strcat is fastest.

For longer strings in longer arrays, "b = a;" is again fastest, with memcpy second.

For huge arrays "b = a;" seems to be fastest.

Where possible use standard array assignment, however this is not always possible, for example when a string of unknown size is passed to a function. In these cases I would suggest using strcat (if you're interested, note the bizarre syntax):


#define strcpy(%0,%1,%2) \
strcat((%0[0] = '\0', %0), %1, %2)


Use:


strcpy(dest, src, sizeof (dest));


Credit to Simon for the suggestion.
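
As a sketch of the unknown-size case, the macro can be wrapped so that the destination size travels with the call. SetStoredName and its parameters are invented for this example; only strcat and the macro itself come from the text above:

```pawn
#include <a_samp>

#define strcpy(%0,%1,%2) \
    strcat((%0[0] = '\0', %0), %1, %2)

// Hypothetical wrapper: "size" defaults to the size of whatever array the
// caller passes, because inside the function the real size of dest is
// not known.
stock SetStoredName(dest[], const src[], size = sizeof (dest))
{
    strcpy(dest, src, size);
}

main()
{
    new name[MAX_PLAYER_NAME];
    SetStoredName(name, "Y_Less"); // size is filled in as sizeof (name)
    print(name);
}
```

The default argument means callers cannot forget to pass the size, and the copy can never run past the end of the destination.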

Assumptions

Introduction

The bottom line is don't make assumptions!

Assumptions are saying you know what something always is just because it's often that. Example:


public OnGameModeInit()
{
AddPlayerClass(167, 0.0, 0.0, 5.0, 0.0, 0, 0, 0, 0, 0, 0); // Clucking Bell employee
AddPlayerClass(179, 0.0, 0.0, 5.0, 0.0, 0, 0, 0, 0, 0, 0); // Ammunation employee
}

public OnPlayerRequestClass(playerid, classid)
{
switch (classid)
{
case 0:
{
GameTextForPlayer(playerid, "Clucking Bell", 5000, 3);
}
case 1:
{
GameTextForPlayer(playerid, "Ammunation", 5000, 3);
}
}
return 1;
}


That's probably very familiar looking code to almost every one of you - setting up your mode's skins and doing things based on the selected one (I've omitted setting cameras here as it's irrelevant to the point). On your own private server this may be fine, you know there's only two skins, you know what order they're added in, and you know they'll never be modified - fine. But if you're going to release a mode this is VERY risky. A few problems which can arise:


A person using your mode also has filterscripts which add skins. This will throw off your class values.
A person using your mode decides they want to add skins and add them to the start of the list. This again throws off your class values.
A SA:MP version change alters the way IDs are assigned. Entirely altering your class IDs.


I'm sure there are more but those are the basics. The problem here is that you're using constants and assuming that they'll never change, either through mode modification or through AddPlayerClass not returning what you expect. There are a few solutions to this problem.

Solutions

Ignore it

If people want to use your mode differently to how you intended, that's their own problem and they can modify the code accordingly. I'm sure everyone here has experience with people asking why, when they add new vehicles to GF, do all the other house and job cars mess up. This is because GF, which was coded for a single server with the intention of not being modified, made assumptions about what cars there were.

This is clearly not a solution at all.

Make modifications easy

The second solution, which balances efficiency and modification, is to make the assumptions, but to minimise their usage. You may have commands which rely on the chosen skin, in which case you may have code like:


dcmd_chicken(playerid, params[])
{
if (gClass[playerid] != 0) return SendClientMessage(playerid, 0xFF0000AA, "Sorry, you're not a Clucking Bell employee");
...
}


Your assumption that the Clucking Bell class is class 0 is now in your mode twice - once in OnPlayerRequestClass and once in dcmd_chicken. Assuming your mode is more than 20 lines long you could end up with this appearing hundreds of times, making modification a nightmare as you have to ensure you get every instance. The simplest way to get round this is use a define (or an enum):


#define CLUCKING_BELL_CLASS (0)
#define AMMUNATION_CLASS (1)

public OnGameModeInit()
{
AddPlayerClass(167, 0.0, 0.0, 5.0, 0.0, 0, 0, 0, 0, 0, 0); // Clucking Bell employee
AddPlayerClass(179, 0.0, 0.0, 5.0, 0.0, 0, 0, 0, 0, 0, 0); // Ammunation employee
}

public OnPlayerRequestClass(playerid, classid)
{
switch (classid)
{
case CLUCKING_BELL_CLASS:
{
GameTextForPlayer(playerid, "Clucking Bell", 5000, 3);
}
case AMMUNATION_CLASS:
{
GameTextForPlayer(playerid, "Ammunation", 5000, 3);
}
}
return 1;
}

dcmd_chicken(playerid, params[])
{
if (gClass[playerid] != CLUCKING_BELL_CLASS) return SendClientMessage(playerid, 0xFF0000AA, "Sorry, you're not a Clucking Bell employee");
...
}


Now when you add a new skin to the beginning of the list, or when you run a filterscript with its own skins, you need only modify one part of your mode to reflect this change. However this can still cause problems if you can't guarantee that a return will always be constant.

Code defensively

The only way to guarantee that you'll never run into problems is to not make any assumptions at all. Save return values and use those known saves, don't try to guess what it might be:


new
gCluckingBellSkin = -1,
gAmmunationSkin = -1;

public OnGameModeInit()
{
gCluckingBellSkin = AddPlayerClass(167, 0.0, 0.0, 5.0, 0.0, 0, 0, 0, 0, 0, 0); // Clucking Bell employee
gAmmunationSkin = AddPlayerClass(179, 0.0, 0.0, 5.0, 0.0, 0, 0, 0, 0, 0, 0); // Ammunation employee
}

public OnPlayerRequestClass(playerid, classid)
{
if (classid == gCluckingBellSkin)
{
GameTextForPlayer(playerid, "Clucking Bell", 5000, 3);
}
else if (classid == gAmmunationSkin)
{
GameTextForPlayer(playerid, "Ammunation", 5000, 3);
}
return 1;
}

dcmd_chicken(playerid, params[])
{
if (gClass[playerid] != gCluckingBellSkin) return SendClientMessage(playerid, 0xFF0000AA, "Sorry, you're not a Clucking Bell employee");
...
}


It now doesn't matter what the return values may or may not be because there's no way they'll change between being saved and used.

Important

There are times when assumptions are OK, as I said if you know your mode won't be released and you know you yourself won't modify it, or are willing to accept all the extra work involved in modifying it, then go for it. But don't be surprised when, some time down the line, everything breaks because you missed something.

Memory reduction

Recall this code from earlier:


new
gIsACar[] = {1, 0, 0, 1, 1, 0, 1, 0, 1, 1},
gIsAHeavyVehicle[] = {1, 0, 1, 0, 0, 0, 0, 0, 1, 0},
gIsABoat[] = {0, 1, 0, 0, 0, 0, 0, 1, 0, 0},
gIsAFireEngine[] = {0, 0, 1, 0, 0, 1, 0, 0, 0, 0};


In PAWN all variables are 32 bits big, which means each one can hold any of 4294967296 different values, but this code is only storing 0 or 1 - you can do that in a single bit. There are ten vehicles, each with four pieces of information (ignoring mutually exclusive information like IsACar/IsABoat for now), that's only 40 pieces of binary information (i.e. 40 true/falses), at 32 bits per cell that's two cells worth of data stored in 40 cells (5 bytes of data (bound to 8 bytes) stored in 160 bytes - a 32-fold increase and VAST waste). There are a few ways to easily reduce this use (although none listed here will attain full compression). The first is to mark all vehicles with an attribute in a single variable, the second is to mark all vehicle attributes in a single variable.

All vehicles

Each vehicle is either something or it isn't; for this example they're either a car or they're not. There are also 10 vehicles, this means there are 10 bits of binary data which we'll be storing in a single variable, that's still a waste of 22 bits, but that's better than 310, and up to 32 vehicles will reduce waste, not increase it.


Bit: 0  1  2  3  4   5   6   7    8    9    10    ...  31
Val: 1  2  4  8  16  32  64  128  256  512  1024  ...  2147483648
1/0: 1  0  0  1  1   0   1   0    1    1    x     ...  x


x means we don't care about the value of these bits as they don't represent vehicles (we could even in theory stick another bit of information in these wasted bits, but won't for now). Assuming all the other bits are 0 then this number is "857", unfortunately this doesn't mean a lot looking at it in decimal, so we need to read it in binary.

Let's say we want to find out if vehicle model 403 is a car. Firstly we subtract 400 to get the number in range, then we need to test the fourth bit (0, 1, 2, 3). The fourth bit has a value of eight, so we need to test whether the number 857 has the bit worth eight set, this is done using bitwise AND:


if (857 & 8)
{
// Bit is set, the vehicle is a car
}


Or, using our array (which is now just a variable as we've reduced it to 1 cell):


if (gIsACar & 8)
{
// Bit is set, the vehicle is a car
}


That's great, but currently this will produce code like:


switch (model - 400)
{
case 0: if (gIsACar & 1) ...
case 1: if (gIsACar & 2) ...
case 2: if (gIsACar & 4) ...
case 3: if (gIsACar & 8) ...
case 4: if (gIsACar & 16) ...


Which is pointless as if you were going to go to that trouble you could just do:


switch (model - 400)
{
case 0: return true;
case 1: return false;
case 2: return false;
case 3: return true;
case 4: return true;
}


We need some way to generate the bit from the model. 1 is 2^0 (where ^ is "to the power of", not XOR), 2 is 2^1, 4 is 2^2 and 8 is 2^3, so 8, the bit we want, is 2^3, where 3 is 403-400:


new
bit = 1;
model -= 400;
for (new i = 0; i < model; i++)
{
bit *= 2;
}


That will get the correct result, but there's a far simpler way of doing two to the power of n, the left shift operator:


8 = 1 << 3


So now we can just do:


if (gIsACar & (1 << (model - 400)))
{
return true;
}
return false;


Better than that, the if checks a value and we return true when it is non-zero and false when it is zero, so we can skip the check entirely and return the value itself:


stock IsACar(model)
{
return gIsACar & (1 << (model - 400));
}


That's now such a small function you could just make it a define:


#define IsACar(%0) \
(gIsACar & (1 << ((%0) - 400)))
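
Putting the pieces together, a minimal sketch of the single-variable version might look like this. The binary literal is just 857 written out bit by bit, and IsACarChecked is a made-up name; the range check is an addition, on the assumption that model IDs outside 400-409 may be passed in:

```pawn
// Bit 0 is model 400, bit 9 is model 409 (0b1101011001 == 857).
new gIsACar = 0b1101011001;

// Hypothetical range-checked variant of IsACar.
stock bool:IsACarChecked(model)
{
    new index = model - 400;
    // Shifting by a negative amount, or by 32 or more, does not test a
    // meaningful bit, so reject anything outside the ten known models.
    if (!(0 <= index < 10)) return false;
    // Comparing against 0 gives a real true/false rather than the raw bit
    // value (which would be 8 for model 403, not 1).
    return (gIsACar & (1 << index)) != 0;
}
```

The macro version is still the faster of the two; the function trades a little speed for not trusting its input.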


All attributes

Every vehicle has a mark for if they're a car, a mark for if they're a boat, a mark for if they're a fire engine and a mark for if they're heavy. This is 4 bits of information per vehicle, and as we've just shown you can store multiple pieces of information in a single cell using bits, where each bit is a single true/false and we can access those bits individually. So what if, instead of using each bit for a different vehicle, we used each bit for a different attribute; so if bit 0 is set it's a car, if bit 1 is set it's a boat, bit 2 is heavy and bit 3 is fire engine? Let's look at an example:

Model 402 is a heavy fire engine, it's not a car or boat. So we have bits 2 and 3 set, that's 4 and 8, 4 + 8 (technically 4 | 8) is 12, so we have:


new gModel402 = 12;


Now we want to check if it's a boat, that's bit 1, or the number 2:


return gModel402 & 2;


This will return false (if it's true it will return TWO, not ONE, as many people incorrectly check for).

This can be turned into an array of attributes for all models:


new gModels[] = {x, x, 12, x, x, x, x, x, x, x};


(x means unknown other)


return gModels[model - 400] & 2;


There is no way to simplify the 2, you would simply have functions like:


#define IsACar(%0) \
(gModels[(%0) - 400] & 1)

#define IsABoat(%0) \
(gModels[(%0) - 400] & 2)

#define IsAHeavy(%0) \
(gModels[(%0) - 400] & 4)


This can't be optimised any more, but it can be made more readable and more easily editable. If I told you that one model was type 10, you would probably have to look at the above lists to see that it was a fire boat, but if I told you that one was a heavy car then you would instantly know what it was. So let's do that instead:


#define MODEL_CAR (1)
#define MODEL_BOAT (2)
#define MODEL_HEAVY (4)
#define MODEL_FIRE (8)

new gModels[] =
{
x,
x,
MODEL_HEAVY | MODEL_FIRE,
x,
x,
x,
x,
x,
x,
x
};

#define IsACar(%0) \
(gModels[(%0) - 400] & MODEL_CAR)

#define IsABoat(%0) \
(gModels[(%0) - 400] & MODEL_BOAT)

#define IsAHeavy(%0) \
(gModels[(%0) - 400] & MODEL_HEAVY)


You can now instantly tell what a vehicle is from its array entry, but there's an even better way of doing this. I'm not going to go into too much detail on this as it's explained in pawn-lang.pdf but you can do:


enum (<<= 1)
{
MODEL_CAR = 1,
MODEL_BOAT,
MODEL_HEAVY,
MODEL_FIRE
}


The rest of the code is exactly the same but this is less typing and means you can add something between MODEL_CAR and MODEL_BOAT without needing to update any values.
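
The same constants can be used to change attributes at run time as well as test them. The following is only a sketch: the function names are invented here and the array is left empty rather than filled with real model data:

```pawn
enum (<<= 1)
{
    MODEL_CAR = 1,
    MODEL_BOAT,
    MODEL_HEAVY,
    MODEL_FIRE
}

new gModels[10]; // Models 400 to 409, all attributes initially clear

// Hypothetical helpers, not part of SA:MP or YSI.
stock Model_AddFlags(model, flags)
{
    gModels[model - 400] |= flags;      // OR sets the given bits
}

stock Model_RemoveFlags(model, flags)
{
    gModels[model - 400] &= ~flags;     // AND with the inverse clears them
}

stock bool:Model_HasAll(model, flags)
{
    // True only if every requested bit is set
    return (gModels[model - 400] & flags) == flags;
}
```

For example, Model_AddFlags(402, MODEL_HEAVY | MODEL_FIRE) stores 12 in slot 2, after which Model_HasAll(402, MODEL_HEAVY | MODEL_FIRE) is true and Model_HasAll(402, MODEL_BOAT) is false.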

More than 32 values

What happens when you have more than 32 values that you want to store? In this case you need an array indexed by how many whole times 32 goes into the value, with the remainder selecting the bit. So, counting cells and bits from 0, value 1 would be cell 0 (as it's not over 32), bit 1; 32 would be cell 1, bit 0; and 66 would be cell 2, bit 2. There is code for the simplification of this in YSI in the form of YSI_bit. In this case your code would look something like:


enum e_MODELS
{
MODEL_CAR,
MODEL_BOAT,
MODEL_HEAVY,
MODEL_FIRE,
...
MODEL_SOMETHING
}

new Bit:gModels[410 - 400];

#define IsACar(%0) \
(Bit_Get(gModels[(%0) - 400], MODEL_CAR))

#define IsSomething(%0) \
(Bit_Get(gModels[(%0) - 400], MODEL_SOMETHING))


The latest code for this (I rewrote it while doing this tutorial as it made me revisit my own code) can be found here (http://y-less.pastebin.ca/1274940).
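
For anyone not using YSI, the cell and bit arithmetic can be sketched by hand. The names below are invented for this example and are not the YSI_bit API; the sketch assumes 32-bit cells and non-negative values:

```pawn
#define MAX_FLAGS   (100)
// Number of cells needed to hold MAX_FLAGS bits, rounded up.
#define FLAG_CELLS  ((MAX_FLAGS + 31) / 32)

new gFlags[FLAG_CELLS];

// Hypothetical helpers: value / 32 picks the cell, value % 32 picks the bit.
stock Flags_Set(value)
{
    gFlags[value / 32] |= (1 << (value % 32));
}

stock bool:Flags_Get(value)
{
    return (gFlags[value / 32] & (1 << (value % 32))) != 0;
}
```

With these, value 1 lands in cell 0 bit 1, value 32 in cell 1 bit 0, and value 66 in cell 2 bit 2.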

Excess dimensions

I don't know why people do this, I'm pretty sure they copy it from one of the more common modes, but that doesn't make it right and I don't know why it was done this way in the first place. If you have an array of values, don't waste dimensions. Example:

I have an array of 10 values, let's for the sake of argument call them weapon prices. So we have 10 weapons, each with a price, and we want to store them in an array. Each weapon has an ID, from 0 to 9, so to get that weapon's price you need to access that index in the array:


new
gPrices[10] = // 10 weapons, thus 10 prices
{
1000,
2000,
5000,
2000,
10000,
500,
3000,
2000,
100,
750
};



new
weaponPrice = gPrices[5];


That's all you need - it's BASIC array access, and yet for some reason people insist on doing the following:


new
gPrices[10][1] = // 10 weapons, thus 10 prices
{
{1000},
{2000},
{5000},
{2000},
{10000},
{500},
{3000},
{2000},
{100},
{750}
};



new
weaponPrice = gPrices[5][0];


What purpose does the extra dimension serve? None at all! If you had two prices per weapon then yes - you would need the extra dimension, but you don't so you don't - just don't do it, simple as! It's a waste of time - it's slower and a waste of space - it's bigger.

CPU vs Memory

This is such a common factor in writing code that it deserves a special mention. In almost everything you have a choice of how to do something, usually one way will be fast but use a lot of memory and the other will be slow but use very little memory. For example in the foreach example above the foreach code is faster, but uses a big array, IPC is slower but uses next to no memory as it has nothing to store. In all these cases you just have to make the decision based on circumstances. The previous section on memory reduction listed all sorts of ways to use less memory but they all used extra code to do it, making them slower than the original huge arrays. However in this case the reduction was so great (an approximately thirty-two fold reduction in memory use) that it more than offset the minor speed decrease (as stated above also bitwise operators are very fast).

There is, however, a third option - complexity.

In the foreach/IPC example foreach had the greater speed and IPC has the better memory footprint, but it's possible to combine the best of both worlds by writing more complex code. More complex code generally handles more situations explicitly, meaning you don't have to write generic code to handle all possibilities. In the player loop example this would manifest itself as something like:


if (IsPlayerConnected(0))
{
// Do something
}
if (IsPlayerConnected(1))
{
// Do something
}
if (IsPlayerConnected(2))
{
// Do something
}

...

if (IsPlayerConnected(199))
{
// Do something
}


Clearly this is very inefficient looking code and very hard to maintain, but you've got rid of the loop and variable, making the code faster and giving it a 0 cell memory footprint. From the table we also know that constants (0, 1 etc) are faster than variables (i). I've not clocked this version so I don't know which is faster for large numbers of players (there's no contest at low numbers) between it and foreach, but it is definitely faster than an IPC loop. Clearly this example is still better with more memory usage but there are circumstances where it may not be.

Lists

This and the next section are basically ripped straight from the wiki (http://wiki.sa-mp.com/wiki/Advanced_Structures) as I wrote it there in the first place.

Lists are a very useful type of structure, they're basically an array where the next piece of relevant data is pointed to by the last piece.

Example:

Say you have the following array:


3, 1, 64, 2, 4, 786, 2, 9


If you wanted to sort the array you would end up with:


1, 2, 2, 3, 4, 9, 64, 786


If however you wanted to leave the data in the original order but still know the numbers in order for some reason (it's just an example), you have a problem, how are you meant to have numbers in two orders at once? This would be a good use of lists. To construct a list from this data you would need to make the array into a 2d array, where the second dimension was 2 cells big, one containing the original number, the other containing the index of the next largest number. You would also need a separate variable to hold the index of the lowest number, so your new array would look like:


start = 1
value: 3,   1,   64,  2,   4,   786, 2,   9
next:  4,   3,   5,   6,   7,   -1,  0,   2


The next index associated with 786 is -1, this is an invalid array index and indicates the end of the list, i.e. there are no more numbers. The two 2's could obviously be either way round, the first one in the array is the first one in the list too as it's the more likely one to be encountered first.

The other advantage of this method of sorting the numbers is adding more numbers is a lot faster. If you wanted to add another number 3 to the sorted array you would need to first shift at least 4 numbers one slot to the right to make space, not terrible here but very slow in larger arrays. With the list version you could just append the 3 to the end of the array and modify a single value in the list;


start = 1
value: 3,   1,   64,  2,   4,   786, 2,   9,   3
next:  8,   3,   5,   6,   7,   -1,  0,   2,   4
       ^ modify this value                     ^ next highest slot


None of the other numbers have moved so none of the other indexes need updating, just make the next lowest number point to the new number and make the new number point to the number the next lowest used to be pointing to. Removing a value is even easier:


start = 1
value: 3,   1,   64,  X,   4,   786, 2,   9,   3
next:  8,   6,   5,   6,   7,   -1,  0,   2,   4
            ^ Changed to jump over the removed value


Here the first 2 has been removed and the number which pointed to that number (the 1) has been updated to point to the number the removed number was pointing to. In this example neither the removed number's pointer nor number have been removed, but you cannot possibly get to that slot following the list so it doesn't matter, it is effectively removed.
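
Reading the values back in order just means following the next indexes from the start until -1 is reached. A minimal sketch using the original eight numbers (the variable and function names are made up for the example):

```pawn
#include <a_samp>

new
    gValues[] = {3, 1, 64, 2, 4, 786, 2, 9},
    gNext[]   = {4, 3, 5, 6, 7, -1, 0, 2},
    gStart    = 1;

// Prints 1, 2, 2, 3, 4, 9, 64, 786 (one per line) without moving any data.
stock PrintInOrder()
{
    for (new i = gStart; i != -1; i = gNext[i])
    {
        printf("%d", gValues[i]);
    }
}
```

The array itself is never reordered, so code that relies on the original slot numbers keeps working.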

Types

The lists in the examples above were just basic single lists, you can also have double lists where every value points to the next value and the last value, these tend to have a pointer to the end of the list too to go backwards (e.g. to get the numbers in descending order):


start = 1
end = 5
value: 3, 1, 64, 2, 4, 786, 2, 9, 3
next: 8, 3, 5, 6, 7, -1, 0, 2, 4
last: 6, -1, 7, 1, 8, 2, 3, 4, 0


You have to be careful with these, especially when you have more than one of any value, that the last pointer points to the number whose next pointer goes straight back again, e.g. this is wrong:


2, 3, 3
1, 2, -1
-1, 2, 0


The 2's next pointer points to the 3 in slot one, but that 3's last pointer doesn't go back to the two, both lists are in order on their own (as the two threes can be either way round) but together they are wrong, the correct version would be:


2, 3, 3
1, 2, -1
-1, 0, 1


Both of those lists start and end on the end two numbers, the back list in the wrong example started on the middle number.
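
The rule that the two lists must mirror each other can be checked mechanically. This is a sketch with an invented function name; it assumes a non-looping list in which every slot is in use and -1 marks each end:

```pawn
// Hypothetical check: returns true if the two lists mirror each other.
stock bool:DoubleList_IsConsistent(const next[], const last[], size)
{
    for (new i = 0; i < size; i++)
    {
        // Going forward then back must return to the same slot
        if (next[i] != -1 && last[next[i]] != i) return false;
        // Going back then forward must as well
        if (last[i] != -1 && next[last[i]] != i) return false;
    }
    return true;
}
```

With next = {1, 2, -1} the wrong last list {-1, 2, 0} fails at slot 0 (the 2 points forward to slot 1, but slot 1 points back to slot 2), while the corrected {-1, 0, 1} passes.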

The other type of list is the looping one where the last value points back to the first. The obvious advantage to this is that you can get to any value from any other value without knowing in advance whether the target is before or after the start point, you just need to be careful not to get into an infinite loop as there's no explicit -1 end point. These lists do still have start points. You can also do double looping lists where you have a next and last list, both of which loop round:


start = 1
end = 5
3, 1, 64, 2, 4, 786, 2, 9, 3
8, 3, 5, 6, 7, 1, 0, 2, 4
6, 5, 7, 1, 8, 2, 3, 4, 0


Mixed lists

Mixed lists are arrays containing multiple lists at once. An example could be an array of values, sorted by a list, with another list linking all unused slots so you know where you can add a new value. Example (X means unused (free) slot):


sortedStart = 3
unusedStart = 1
value: 34, X,  X,  6,  34, 46, X,  54, 23, 25, X,  75, X,  45
sort:  4,          8,  13, 7,      11, 9,  0,      -1,     5
free:      2,  6,              10,             12,     -1


Obviously the two lists never interact so both can use the same slot for their next value:


sortedStart = 3
unusedStart = 1
value: 34, X, X, 6, 34, 46, X, 54, 23, 25, X, 75, X, 45
next: 4, 2, 6, 8, 13, 7, 10, 11, 9, 0, 12, -1, -1, 5


Code

Before you start the code you need to decide what sort of list is best suited for your application, this is entirely based on application and can't easily be covered here. All these examples are mixed lists, one list for the required values, one for unused slots.

This example shows how to write code for a list sorted numerically ascending.


#define NUMBER_OF_VALUES (10)

enum E_DATA_LIST
{
    E_DATA_LIST_VALUE,
    E_DATA_LIST_NEXT
}

new
    gListData[NUMBER_OF_VALUES][E_DATA_LIST],
    gUnusedStart = 0,
    gListStart = -1; // Starts off with no list

// This function initializes the list
List_Setup()
{
    new
        i,
        size = NUMBER_OF_VALUES - 1; // Index of the last slot
    for (i = 0; i < size; i++)
    {
        // To start with all slots are unused
        gListData[i][E_DATA_LIST_NEXT] = i + 1;
    }
    // End the list
    gListData[size][E_DATA_LIST_NEXT] = -1;
}

// This function adds a value to the list (using basic sorting)
List_Add(value)
{
    // Check there are free slots in the array
    if (gUnusedStart == -1) return -1;
    new
        pointer = gListStart,
        last = -1,
        slot = gUnusedStart;
    // Add the value to the array
    gListData[slot][E_DATA_LIST_VALUE] = value;
    // Update the empty list
    gUnusedStart = gListData[slot][E_DATA_LIST_NEXT];
    // Loop through the list till we get to bigger/same size number
    while (pointer != -1 && gListData[pointer][E_DATA_LIST_VALUE] < value)
    {
        // Save the position of the last value
        last = pointer;
        // Move on to the next slot
        pointer = gListData[pointer][E_DATA_LIST_NEXT];
    }
    // If we got here we ran out of values or reached a larger one
    // Check if we checked any numbers
    if (last == -1)
    {
        // The first number was bigger or there is no list
        // Either way add the new value to the start of the list
        gListData[slot][E_DATA_LIST_NEXT] = gListStart;
        gListStart = slot;
    }
    else
    {
        // Place the new value in the list
        gListData[slot][E_DATA_LIST_NEXT] = pointer;
        gListData[last][E_DATA_LIST_NEXT] = slot;
    }
    return slot;
}

// This function removes a value from a given slot in the array (returned by List_Add)
List_Remove(slot)
{
    // Is this a valid slot
    if (slot < 0 || slot >= NUMBER_OF_VALUES) return 0;
    // First find the slot before
    new
        pointer = gListStart,
        last = -1;
    while (pointer != -1 && pointer != slot)
    {
        last = pointer;
        pointer = gListData[pointer][E_DATA_LIST_NEXT];
    }
    // Did we find the slot in the list
    if (pointer == -1) return 0;
    if (last == -1)
    {
        // The value is the first in the list
        // Skip over this slot in the list
        gListStart = gListData[slot][E_DATA_LIST_NEXT];
    }
    else
    {
        // The value is in the list
        // Skip over this slot in the list
        gListData[last][E_DATA_LIST_NEXT] = gListData[slot][E_DATA_LIST_NEXT];
    }
    // Add this slot to the unused list
    // The unused list isn't in any order so this doesn't matter
    gListData[slot][E_DATA_LIST_NEXT] = gUnusedStart;
    gUnusedStart = slot;
    return 1;
}


Binary trees

Binary trees are a very fast method of searching for data in an array by using a very special list system. The most well known binary tree is probably the 20 questions game, with just 20 yes/no questions you can tell apart 1048576 (2^20) items. A binary tree, as its name implies, is a type of tree, similar to a family tree, where every item has 0, 1 or 2 children. They are not used for ordering data like a list but sorting data for very efficient searching. Basically you start with an item somewhere near the middle of the ordered list of objects (e.g. the middle number in a sorted array) and compare that to the value you want to find. If it's the same you've found your item, if it's greater you move to the item to the right (not immediately to the right, the item to the right of the middle item would be the item at the three quarter mark), if it's less you move left, then repeat the process.

Example


1 2 5 6 7 9 12 14 17 19 23 25 28 33 38


You have the preceding ordered array and you want to find what slot the number 7 is in (if it's in at all), in this example it's probably more efficient to just loop straight through the array to find it but that's not the point, that method increases in time linearly with the size of the array, a binary search time increases linearly as the array increases exponentially in size. I.e. an array 128 big will take twice as long to search straight through as an array 64 big, but a binary search 128 big will only take one check more than a binary search 64 big, not a lot at all.

If we construct a binary tree from the data above we get:

http://wiki.sa-mp.com/wroot/images2/f/fe/Binarytree.GIF

If you read left to right, ignoring the vertical aspect you can see that the numbers are in order. Now we can try find the 7.

The start number is 14, 7 is less than 14 so we go to the slot pointed to by the left branch of 14. This brings us to 6, 7 is bigger than 6 so we go right to 9, then left again to 7. This method took 4 comparisons to find the number (including the final check to confirm that we are on 7), using a straight search would have taken 5.
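
When the data is already in a sorted array, the same sequence of comparisons can be made without building the tree at all, by repeatedly halving the range of slots. This is a sketch of a standard binary search, not code from the original wiki page:

```pawn
// Returns the slot holding "value" in a sorted array, or -1 if absent.
stock BinarySearch(const array[], size, value)
{
    new
        low = 0,
        high = size - 1,
        mid;
    while (low <= high)
    {
        mid = (low + high) / 2;
        if (array[mid] == value) return mid;   // Found it
        if (array[mid] < value) low = mid + 1; // Target is to the right
        else high = mid - 1;                   // Target is to the left
    }
    return -1; // Not in the array
}
```

On the fifteen numbers above, searching for 7 checks 14, 6, 9 and then 7, and returns slot 4.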

Let's say there is no 7, we would end up with this binary tree:

http://wiki.sa-mp.com/wroot/images2/e/e5/Binarytree-7-less.GIF

This, unlike the example above, has a single child number (the 9), as well as 2 and 0 child numbers. You only get a perfect tree when there are (2^n)-1 numbers (0, 1, 3, 7, 15, 31 ...), any other numbers will give a not quite full tree. In this case when we get to the 9, where the 7 will be, we'll find there is no left branch, meaning the 7 doesn't exist (it cannot possibly be anywhere else in the tree, think about it), so we return -1 for invalid slot.

Balanced and unbalanced

The trees in the examples above are called balanced binary trees, this means as near as possible all the branches are the same length (obviously in the second there aren't enough numbers for this to be the case but it's as near as possible). Constructing balanced trees is not easy, the generally accepted method of constructing almost balanced trees is putting the numbers in in a random order, this may mean you end up with something like this:

http://wiki.sa-mp.com/wroot/images2/a/a2/Binarytree-uneven.GIF

Obviously this tree is still valid but the right side is much larger than the left, however finding 25 still only takes 7 comparisons in this compared to 12 in the straight list. Also, as long as you start with a fairly middle number the random insertion method should produce a fairly balanced tree. The worst possible thing you can do is put the numbers in in order as then there will be no left branches at all (or right branches if done the other way), however even in this worst case the binary tree will take no longer to search than the straight list.

Modification

Addition

Adding a value to a binary tree is relatively easy, you just follow the tree through, using the value you want to add as a reference until you reach an empty branch and add the number there. E.g. if you wanted to add the number 15 to our original balanced tree it would end up on the left branch of the 17. If we wanted to add the number 8 to the second balanced tree (the one without the 7) it would end up in the 7's old slot on the left of the 9.
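
As a rough sketch of that walk, a tree can be held in three parallel arrays using the same index-as-pointer idea as the lists section. All the names here are invented for the example, there is no balancing or deletion, and the tree is simply capped at TREE_SIZE entries:

```pawn
#define TREE_SIZE (32)

new
    gTreeValue[TREE_SIZE],
    gTreeLeft[TREE_SIZE],
    gTreeRight[TREE_SIZE],
    gTreeCount = 0,
    gTreeRoot = -1; // -1 means "no node", as in the lists section

// Hypothetical function: adds a value and returns its slot, or -1 if full.
stock Tree_Add(value)
{
    if (gTreeCount >= TREE_SIZE) return -1;
    new
        slot = gTreeCount++,
        parent = -1,
        node = gTreeRoot;
    gTreeValue[slot] = value;
    gTreeLeft[slot] = -1;
    gTreeRight[slot] = -1;
    // Follow the tree down until we fall off an empty branch
    while (node != -1)
    {
        parent = node;
        node = (value < gTreeValue[node]) ? gTreeLeft[node] : gTreeRight[node];
    }
    // Hang the new slot on that empty branch (or make it the root)
    if (parent == -1) gTreeRoot = slot;
    else if (value < gTreeValue[parent]) gTreeLeft[parent] = slot;
    else gTreeRight[parent] = slot;
    return slot;
}

// Hypothetical function: returns the slot holding value, or -1 if absent.
stock Tree_Find(value)
{
    new node = gTreeRoot;
    while (node != -1 && gTreeValue[node] != value)
    {
        node = (value < gTreeValue[node]) ? gTreeLeft[node] : gTreeRight[node];
    }
    return node;
}
```

Equal values go to the right in this sketch; the text does not say which side duplicates belong on, so that is a choice made here rather than a rule.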

Deletion

Deleting a number from a binary tree can be hard or it can be easy. If the number is at the end of a branch (e.g. 1, 5, 7, 12 etc in the original tree) you simply remove them. If a number only has one child (e.g. the 9 in the second example) you simply move that child (e.g. the 12) up into their position (so 6's children would be 2 and 12 in the new second example with 9 removed). Deletion only gets interesting when a node has two children. There are at least four ways of doing this:

The first method is the simplest computationally. Basically you choose one of the branches (left or right, assume right for this explanation) and replace the node you've removed with the first node of that branch (i.e. the right child of the node you've removed). You then go left through that new branch till you reach the end and place the left branch there. E.g. if you removed the 14 from the original example you would end up with 25 taking its place at the top of the tree and 6 attached to the left branch of 17. This method is fast but ends up with very unbalanced trees very quickly.

The second method is to get all the numbers which are children of the node you just removed and rebuild a new binary tree from them, then put the top of that tree into the node you've just removed. This keeps the tree fairly well balanced but is obviously slower.

The third method is to combine the two methods above and rebuild the tree in-line. This is more complex to code but keeps the tree balanced and is faster than the second method (though nowhere near as fast as the first).

The final method listed here is to simply set a flag on a value saying it's not used any more. This is even faster than the first method and maintains the structure, but means you can't re-use slots unless you can find a value to replace it with later.
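
The flag method could look something like the sketch below. This is only an illustration under a few assumptions: the tree is a tiny hand-built one (14 at the top, 6 and 25 below it) stored in parallel arrays, there are no duplicate values, and gDeleted, TreeFind and TreeRemove are names invented here.

```pawn
#include <a_samp>

#define NO_NODE (-1)

// Hand-built tree: slot 0 is 14, its left child is 6 and its right child is 25.
new
    gValue[3] = {14, 6, 25},
    gLeft[3] = {1, NO_NODE, NO_NODE},
    gRight[3] = {2, NO_NODE, NO_NODE},
    bool:gDeleted[3];

// Returns the slot holding "value", or NO_NODE if it is missing or flagged.
stock TreeFind(value)
{
    new
        cur = 0;
    while (cur != NO_NODE)
    {
        if (value == gValue[cur])
        {
            // The slot still exists, but only counts if it is not flagged.
            return gDeleted[cur] ? NO_NODE : cur;
        }
        // Flagged nodes are still used for navigation.
        cur = (value < gValue[cur]) ? gLeft[cur] : gRight[cur];
    }
    return NO_NODE;
}

// Flags "value" as unused, the links between nodes are never touched.
stock bool:TreeRemove(value)
{
    new
        slot = TreeFind(value);
    if (slot == NO_NODE) return false;
    gDeleted[slot] = true;
    return true;
}

main()
{
    TreeRemove(14);
    // 14 is now reported as missing, 25 is still found through the flagged top node.
    printf("14 -> %d, 25 -> %d", TreeFind(14), TreeFind(25));
}
```

Note that the 14 has two children here, which is the awkward case for the other methods, yet nothing has to be moved.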

Y_Less
03/12/2008, 03:07 PM
Reserved

Y_Less
03/12/2008, 03:08 PM
Reserved

Y_Less
03/12/2008, 03:08 PM
Reserved

Karlip
03/12/2008, 06:33 PM
Gosh. Err... I suppose I can say 'oh wow, he done it again' - but I won't. Are you sure that you're actually not the creator of PAWN? :3


Lol.

Good job Y_less.
This is awesome! :D
*bow*

Y_Less
03/12/2008, 06:35 PM
Gosh. Err... I suppose I can say 'oh wow, he done it again' - but I won't. Are you sure that you're actually not the creator of PAWN? :3


No, most of these techniques aren't specific to PAWN, in fact very little of the actual information is anything to do with PAWN, just the context and examples.

yom
03/12/2008, 06:50 PM
Your best post ever :) Thanks

I remember you explained binary trees to me a long time ago. Never really used them, they still sound complex :)

JaTochNietDan
03/12/2008, 07:33 PM
Good job Alex, thanks for this. I will be sure to use your advice :)

FUNExtreme
03/12/2008, 09:38 PM
very nice tutorial :)
but WHERE do you get the TIME?

*claps his hands blue*

Ytong
03/12/2008, 09:47 PM
Omg o_o. Very good advice. At the moment I'm trying to learn and use bit manipulation correctly, so this may help me in a way.
But I stopped reading after the second post because otherwise my brain would have exploded ._.
I'm going to read the last post tomorrow :D
Thanks very much (I think I've learned more from you than I will ever learn from my ******** ********** ******* ********** ********* coding teachers in school ._. (I'm now nearly 18 and we're learning how to make two-dimensional arrays -.-))

Backwardsman97
03/12/2008, 10:32 PM
Thanks for this. It really does help if I could understand it. :lol: Nah, I see what you're saying in most parts.

Y_Less
03/12/2008, 11:54 PM
very nice tutorial :)
but WHERE do you get the TIME?

*claps his hands blue*



But I stopped reading after the second post because otherwise my brain would have exploded ._.
I'm going to read the last post tomorrow :D


In response to both these, this has been worked on for the last few weeks, started almost straight after the strings one and worked on now and then in between whenever I had a few spare minutes. I don't expect anyone to read it in a day, in fact anything less than a week and you'll probably miss something.

Speaking of which I'm sure I've missed loads of things, and I'm sure there are loads of techniques I don't even know, so if you think of anything PLEASE share them, I want to learn as much as the rest of you.

Y_Less
04/12/2008, 05:44 PM
OK, I've added a new section on return values, if you've read everything else it's fairly stand alone so you may want to go back and read that.

Finn
04/12/2008, 06:53 PM
Holy-fucking-shit.

Just by reading the first post I noticed how stupid I've been when scripting pawn. I think I've got to fix many parts of my script to make it run smoother.

I don't understand those <<'s and >>>'s, but as my script works fine without them, I don't think I need to learn how to use them.

Ytong
04/12/2008, 07:54 PM
I don't understand those <<'s and >>>'s, but as my script works fine without them, I don't think I need to learn how to use them.


for(new i; i < 1000; i++)
{
    for(new playerid = 0; playerid < MAX_PLAYERS; playerid++)
    {
        if(!IsPlayerConnected(playerid))
        {
            new
                somevar = 3;

            for(new j; j < 100; j++)
            {
                somevar = somevar * 2;
                somevar = somevar * 4;
                somevar = somevar * 6;
                somevar = somevar * 8;
                somevar = somevar * 10;
                somevar = somevar / 2;
                somevar = somevar / 4;
                somevar = somevar / 6;
                somevar = somevar / 8;
                somevar = somevar / 10;
            }
        }
    }
}

This also "works", but it's very badly coded.
The following still isn't the best, as it doesn't use the foreach syntax, but it's clearly much faster.


for(new i; i < 1000; i++)
{
    new
        MaxPlayers = GetMaxPlayers();
    for(new playerid = 0; playerid < MaxPlayers; playerid++)
    {
        if(!IsPlayerConnected(playerid))
        {
            new
                somevar = 3;

            for(new j; j < 100; j++)
            {
                somevar <<= 1;
                somevar <<= 2;
                somevar <<= 3;
                somevar <<= 4;
                somevar <<= 5;

                somevar >>= 1;
                somevar >>= 2;
                somevar >>= 3;
                somevar >>= 4;
                somevar >>= 5;
            }
        }
    }
}



Assuming a full server with 200 players connected, the first example needs 7359 milliseconds and the second just 2250, so the first one needs more than 3 times the server effort...
Just "working" doesn't mean it's good.

Y_Less
04/12/2008, 09:25 PM
Those aren't actually the same, <<1 is *2, <<2 is *4, <<3 is *8, <<4 is *16 and <<5 is *32.
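
A quick way to see this for yourself (just a sketch, the variable names are made up) is to print each shift next to the matching multiplication:

```pawn
#include <a_samp>

main()
{
    new
        value = 3;
    for (new shift = 1; shift <= 5; shift++)
    {
        // (1 << shift) is 2, 4, 8, 16 then 32, so the two results on each line match.
        printf("%d << %d = %d, %d * %d = %d", value, shift, value << shift, value, 1 << shift, value * (1 << shift));
    }
}
```

6 and 10 are not powers of two, so there is no single shift that replaces *6 or *10.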

Ytong
06/12/2008, 01:30 PM
argh damn it, I added 2 instead of multiplying by 2

*Tjong hits himself

But I think the result wouldn't change too much.

Maniek
06/12/2008, 01:56 PM
(...)
So for example:


for (new i = 0; i < MAX_PLAYERS; i++)
{
}


Is faster than:


for (new i = 0, j = GetMaxPlayers(); i < j; i++)
{
}

(...)


Yes, MAX_PLAYERS is fast, but if we add this in OnGameModeInit():
MAX_SLOTS = GetMaxPlayers();
then MAX_SLOTS is better than MAX_PLAYERS.
I think so :roll:

Norn
06/12/2008, 02:00 PM
(...)
So for example:


for (new i = 0; i < MAX_PLAYERS; i++)
{
}


Is faster than:


for (new i = 0, j = GetMaxPlayers(); i < j; i++)
{
}

(...)


Yes, MAX_PLAYERS is fast, but if we add this in OnGameModeInit():
MAX_SLOTS = GetMaxPlayers();
then MAX_SLOTS is better than MAX_PLAYERS.
I think so :roll:


I thought GetMaxPlayers gets the amount you defined in server.cfg, I could be wrong.

Antironix
06/12/2008, 02:08 PM
Yup, that's true Norn. Just undefine MAX_PLAYERS and redefine it at the top of your script, or change it in your a_samp.inc.
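
A minimal sketch of that redefinition, assuming a 100 slot server (the number and the gScore array are only examples). It has to come after a_samp is included and before anything that sizes arrays with MAX_PLAYERS:

```pawn
#include <a_samp>

// Replace the default from the include with this server's real slot count.
#undef MAX_PLAYERS
#define MAX_PLAYERS (100)

// Any array declared from here on uses the smaller size.
new
    gScore[MAX_PLAYERS];

main()
{
    printf("MAX_PLAYERS is now %d, gScore has %d cells", MAX_PLAYERS, sizeof (gScore));
}
```

Being a constant, the new MAX_PLAYERS keeps its place in the "speed order" list, unlike a GetMaxPlayers() call or a MAX_SLOTS variable.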

Y_Less
08/12/2008, 12:45 AM
(...)
So for example:


for (new i = 0; i < MAX_PLAYERS; i++)
{
}


Is faster than:


for (new i = 0, j = GetMaxPlayers(); i < j; i++)
{
}

(...)


Yes, MAX_PLAYERS is fast, but if we add this in OnGameModeInit():
MAX_SLOTS = GetMaxPlayers();
then MAX_SLOTS is better than MAX_PLAYERS.
I think so :roll:


No, read the section on "speed".

Simon
08/12/2008, 03:38 AM
if ( !strlen( szTmp ) ) // This will be slower when you have a full string.
if ( !szTmp[ 0 ] ) // This should be faster when you have a full string, same result.


I think this could be a way to optimize checks for no string. Not sure how it works internally but I've always thought of strlen as a loop looking for NULL character, if you know the NULL character in an empty string is at the beginning then what's the point in looping at all?

Y_Less
08/12/2008, 04:46 AM
That is how to check for an empty string, and that's exactly how strlen works internally, which is also why the string loops in my posts are better than using strlen, as it means you only loop through once, not twice. I'll add that to the post.
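
A small sketch of both points (CountSpaces is a made-up name, not something from the first post): the loop condition tests for the NULL character itself, so the string is walked once, rather than once by strlen and then again by the loop.

```pawn
#include <a_samp>

// Counts spaces while looking for the end of the string in the same pass.
stock CountSpaces(const str[])
{
    new
        count = 0;
    // str[i] is 0 at the end of the string, which ends the loop.
    for (new i = 0; str[i]; i++)
    {
        if (str[i] == ' ') count++;
    }
    return count;
}

main()
{
    new
        empty[8];
    // A new array is all zeros, so the first cell is already the NULL character.
    if (!empty[0]) print("empty string");
    printf("%d spaces", CountSpaces("a b c"));
}
```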

Edit: Added a section on small snippets like that that don't really deserve a full section.

Toribio
09/12/2008, 04:49 AM
Nice topic. When you update it, I suggest you show how to use triadic (ternary) operations. It'll help many people :)

Something like (for those who don't know):
new
    bool:a = true,
    b;
if(a == true)
{
    b = 15;
} else {
    b = 9;
}

So, if "a" is true, "b" will be 15, else "b" will be 9.
This should be simplified to:
new
    bool:a = true,
    b = (a == true) ? 15 : 9;


More simplified:
new
    bool:a = true,
    b = a ? 15 : 9;


All are the same...

MenaceX^
12/12/2008, 12:58 AM
Wow..
Thanks alot Alex .

LarzI
13/12/2008, 11:56 AM
OMG!

I just C/P'd all his "notes" into pawno, and I just found out (without pictures of course) this "tutorial" has 1552 lines! :o :o

http://pastebin.com/f5d1a0668
There's the pastebin for these topics (It will be deleted in a day)

____

Well, It's Y_Less ;)

Extremo
14/12/2008, 08:40 AM
Hehe, I love the whole topic, been reading everything Y_Less has posted so far, always helped me alot, much luff for u alex :D

Ycto
18/12/2008, 09:19 AM
W..o..w.. :O
Really useful, thanks!

Y_Less
18/12/2008, 02:22 PM
I don't understand those <<'s and >>>'s, but as my script works fine without them, I don't think I need to learn how to use them.


That's a stupid statement! The whole point of this post was that there are usually alternate ways of doing things, yes, your script might work without them, but that doesn't mean it won't work better with them (although it may not).

Backwardsman97
19/12/2008, 07:58 AM
I don't understand those <<'s and >>>'s, but as my script works fine without them, I don't think I need to learn how to use them.


That's a stupid statement! The whole point of this post was that there are usually alternate ways of doing things, yes, your script might work without them, but that doesn't mean it won't work better with them (although it may not).


I don't understand those but I would like to.

Y_Less
19/12/2008, 10:05 AM
That, as I stated in the tutorial, is what pawn-lang.pdf is for.

Finn
19/12/2008, 08:40 PM
That, as I stated in the tutorial, is what pawn-lang.pdf is for.

This .pdf does not explain those <<'s and >>>'s well enough for me, or at least I don't understand what they mean, maybe it's my bad English but whatever lol. Well, it doesn't matter as there are millions of ways to do the same thing differently.

Another question; I've seen a lot of people defining their functions instead of using stock. Do these have any difference, is the other one any faster or what, why are you using them?

#define Something(%0) ( ohai[%0]++ )

stock Something(value) { ohai[value]++; }

Yes, the define maybe fucked up, but don't look at it that way. Does the define work faster than the stock one, or are you guys using them only to look a little bit more professional than everyone else?

Ytong
19/12/2008, 09:47 PM
Another question; I've seen a lot of people defining their functions instead of using stock. Do these have any difference, is the other one any faster or what, why are you using them?




Speed order

Different language features take different times to execute, in general the order is (from fastest to slowest):


Nothing
Constants
Variables
Arrays
Native functions
Custom functions
Remote functions



Your example #define would deal with an array, or maybe in another place with normal variables. As you can see in the list, the stock function (custom function) would need more time...

Y_Less
19/12/2008, 10:32 PM
Defines place the code inline, meaning if you use one a hundred times you'll get a hundred copies of the code, whereas a function's code only exists once, so calling it a hundred times will just call the same single piece of code. Anyway, you need to be very careful with definitions if you don't understand how their parameters work, as they can cause unexpected side effects. For example:


stock Print(num)
{
    printf("%d %d", num, num);
}

main()
{
    new
        i = 1;
    Print(i++);
    Print(i++);
}


That will output:


1 1
2 2


Whereas:


#define Print(%0) printf("%d %d", %0, %0)

main()
{
    new
        i = 1;
    Print(i++);
    Print(i++);
}


Will output:


1 2
3 4


Or possibly:


2 1
4 3


I'm not entirely sure about execution order for pre/postfix operators in function parameters, but that's just another reason to not use them. Basically if you don't know why they're used, don't.

Ytong
20/12/2008, 11:00 AM
Or possibly:


2 1
4 3


I'm not entirely sure about execution order for pre/postfix operators in function parameters, but that's just another reason to not use them. Basically if you don't know why they're used, don't.


Parameters are evaluated starting at the last and ending at the first.
It happened to me once, a year ago, when I tried something like this:


new
string[64],
index = 0;
string = "411;123.4;456.7;89.0";
CreateVehicle(strval(strtok(string,index,';')), floatstr(strtok(string,index,';')), floatstr(strtok(string,index,';')), floatstr(strtok(string,index,';')), 0.0, -1, -1, -1);


Please don't kill me, because of the strtok thing :S (This was one of my first scripts I wrote =/ )

Well, as the parameters are evaluated backwards, my game crashed, as the modelid of the car would have been "89" in this case, and I wasn't able to find the mistake for hours...

Y_Less
20/12/2008, 01:02 PM
Functions and postfix operators are different, functions are called before something happens, postfix operators after, hence why I wasn't sure for my example, but do know for yours.

PhyroIS
21/12/2008, 09:27 PM
Very Nice, I Read All :P

x-cutter
26/12/2008, 03:20 PM
Just to be sure I understood :

using if(!strlen(string)) is slower than simply doing if(!string[0]) ?

thanks

Y_Less
26/12/2008, 04:54 PM
Yes.

Y_Less
27/12/2008, 04:07 PM
New update:

Distance checks (http://forum.sa-mp.com/index.php?topic=79810.msg522270#post_distance)

I don't know why I didn't include this in the original version, it's possibly the most important one I've written!

kc
28/12/2008, 11:17 PM
Stunning post.

I learnt alot from it.

A question though - does using packed strings give a processor usage overhead compared with creating a regular string? or is the difference negligible?

Y_Less
29/12/2008, 01:39 AM
I wouldn't be surprised if packed strings were faster as they're closer to C strings.

Y_Less
03/01/2009, 05:30 PM
OK, I've just updated the first post with an EVEN FASTER distance check. I'm not talking a few extra milliseconds saved when you run it millions of times, this version is over 30 times faster!

Y_Less
01/02/2009, 01:58 PM
Update:

Added a section on assumptions and corrected some formatting in the first post. I'm not actually sure if the error was introduced when I updated the table of contents for the new section or if it was there already, but it's fixed either way.

Yaheli_Faro
04/02/2009, 01:40 PM
If I have a variable, I'll use vehicle IDs for example, should I use
new Bit:vid;
or

new vid;
:?:

Y_Less
04/02/2009, 02:04 PM
Depends what you want to use it for. I personally use Bit: to indicate that it's a bit array. Vehicle IDs aren't bit arrays.

[LDT]LuxurY
15/02/2009, 03:49 PM
Today I was working on a function that returns string length:

The best result I got with this func:

stock slen ( s[] , &out ) while ( s[out] ) out++;

I've noticed that this func works faster than:

stock slen ( s[] , out = 0)
{
while ( s[out] ) out++;
return out;
}

So with slen I got these results (compared with the standard strlen; column 1: slen | column 2: strlen):
Tested with a 1,000,000 iteration loop

Long string:
199 225 | Time
142 142 | Length

Short string:
199 129 | Time
12 12 | Length

Empty string:
197 118 | Time
0 0 | Length

With long strings slen works faster, but with short or empty strings the standard strlen works faster.
Testing code you can find here (http://pawn.pastebin.com/f5589ed48).

Then I decided to make a define and it worked extremely fast.

native xlen ( s[] , &out );

#define xlen(%1,%2) while ( %1[%2] ) %2++
So with xlen I got these results (compared with the standard strlen; column 1: xlen | column 2: strlen):
Tested with a 1,000,000 iteration loop

Long string:
110 226 | Time
142 142 | Length

Short string:
110 129 | Time
12 12 | Length

Empty string:
111 117 | Time
0 0 | Length

So it works even faster than strlen.
Testing code you can find here (http://pawn.pastebin.com/f3845369f).

So defines work faster than functions.

Good luck,
[LDT]LuxurY.

Y_Less
15/02/2009, 04:58 PM
They may do in that situation, doesn't mean they always do. And I always use:

if (!str[0])

To check if a string is empty, not:

if (!strlen(str))

Anyway, try:


/*
native xlen ( s[] , &out ); // Correctly commented non-existent native
*/

#define xlen(%1,%2) while ( %1[%2++] )


That should be faster (although it does depend on the compiler).

Also, have you ever heard of braces?

Edit: The code I posted gives some interesting possibilities:


new
a,
str[] = "hello";
// Get the length
xlen(str, a);

// Use it as a loop
xlen(str, a)
{
printf("str[%d] = %c", a, str[a]);
}


However the obvious advantage of function calls with returns over defines or pass-by-reference functions is that you can do:


new
str[] = "hello",
a = slen(str);


Or:


if (slen(str) == 4) {}


Which you can't do with your method.

[LDT]LuxurY
15/02/2009, 05:33 PM
your version works for me only as a loop example. for getting the length it gives:
error 036: empty statement

Edit:

new
a,
str[] = "hello";
// Get the length
xlen(str, a) { }
printf("%d", a);

gives 6.

correct:

#define xlen(%1,%2) while ( %1[++%2] )

anyway %1[++%2] or %1[%2++] in non-defined function worked slower.

Y_Less
15/02/2009, 06:16 PM
Sorry, in C it's "while (x) ;" for an empty loop, PAWN is, as you correctly said, "while (x) {}". I get muddled up sometimes.

In which case:


#define xlen(%1,%2) do {} while (%1[%2++])


That should allow you to put a semi-colon on the end properly and still function the same, although it now won't work as a loop.

Also, the other problem that just occurred to me nicely illustrated by these bits of code:


new
str[] = "hello",
a = 10;
a = strlen(str);
// a is now 5...



new
str[] = "hello",
a = 10;
xlen(str, a);
// Your script just crashed...


Note that your slen function is just as susceptible to this problem, this goes a tiny way to justify some of the extra overhead.

_________________________________________________

Edit:

Just realised another problem:


new
str[] = "hello";
gSomeVar[10][E_ENUM_ELEMENT] = strlen(str);



new
str[] = "hello";
xlen(str, gSomeVar[10][E_ENUM_ELEMENT]);


That, your way, will do a 2d array access and write every iteration. I'm not sure how much slower it will make the function but it will be significant.

_________________________________________________

Edit 2:

The other other problem is related to that one:


new
idx = 0,
str[] = "hello";
gSomeVar[idx++][E_ENUM_ELEMENT] = strlen(str);



new
idx = 0,
str[] = "hello";
xlen(str, gSomeVar[idx++][E_ENUM_ELEMENT]);


The first version will get the length, save it to "gSomeVar[0][E_ENUM_ELEMENT]" and increment the index counter. The second version will increment the index counter EVERY iteration, which, if it doesn't crash your script, will cause all sorts of problems and only return the correct length and correct new index if the string has length 0.

So, basically, yes, defines can be faster than functions, but we have functions for a reason.

[LDT]LuxurY
15/02/2009, 06:42 PM
yes, but we can always do:

new
    str[] = "hello",
    a = 10;
a = 0;
xlen(str, a);

//script not crashed

new
    a = 0,
    str[] = "hello";
xlen(str,a);
gSomeVar[10][E_ENUM_ELEMENT] = a;


new
    idx = 0,
    a = 0,
    str[] = "hello";

xlen(str,a);
gSomeVar[idx++][E_ENUM_ELEMENT] = a;


otherwise we can't create a function faster than strlen.

Y_Less
15/02/2009, 07:17 PM
I know there are ways around them all, but if people don't understand how and why it works, and use it like they would any other function, there's a problem. Realistically, if you release functions, you shouldn't make people reinvent the way they code. I don't like dudb for this reason as it introduces a . operator, same with xObjects using big arrays for objects instead of function calls*.

* I've done this before too, but have updated scripts since.

Y_Less
26/02/2009, 05:43 PM
Update:

Added section on wasted dimensions.

CracK
11/03/2009, 04:05 AM
NameCheck(name[])
{
new
i,
ch;
while ((ch = name[i++]) && ((ch == '_') || ('0' <= ch <= '9') || ((ch |= 0x20) && ('a' <= ch <= 'z')))) {}
return !ch;
}

You forgot to add [ and ], maybe for some reason.
NameCheck(name[])
{
new
i,
ch;
while ((ch = name[i++]) && ((ch == ']') || (ch == '[') || (ch == '_') || ('0' <= ch <= '9') || ((ch |= 0x20) && ('a' <= ch <= 'z')))) {}
return !ch;
}

Y_Less
11/03/2009, 10:41 AM
Yes I did, fixed, thanks.

Double-O-Seven
01/05/2009, 10:15 AM
Why doesn't this work!?
(((x1 - x2) ^ 2) + ((y1 - y2) ^ 2) + ((z1 - z2) ^ 2)) ^ 0.5
^ doesn't fucking work Dx

yom
01/05/2009, 10:39 AM
Because ^ isn't the 'power' symbol, it is XOR symbol.

Double-O-Seven
01/05/2009, 10:41 AM
Why is there no power symbol? Dx

BP13
12/05/2009, 10:59 PM
what is this?

Backwardsman97
13/05/2009, 01:34 AM
Code optimization thread?

Span1ard
11/06/2009, 10:52 AM
Double-O-Seven ]
Why is there no power symbol? Dx

use floatpower (http://wiki.sa-mp.com/wiki/Scripting_Functions_Old#floatpower)
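
For example, the expression above could be written like this. It's only a sketch: Distance3D is a made-up name, the squares are done by plain multiplication rather than floatpower(value, 2.0), and floatsqroot takes the place of the "^ 0.5":

```pawn
#include <a_samp>

// Straight-line distance between two points.
stock Float:Distance3D(Float:x1, Float:y1, Float:z1, Float:x2, Float:y2, Float:z2)
{
    new
        Float:dx = x1 - x2,
        Float:dy = y1 - y2,
        Float:dz = z1 - z2;
    // Multiplying a value by itself squares it without calling a native.
    return floatsqroot(dx * dx + dy * dy + dz * dz);
}

main()
{
    // A 3-4-5 triangle in the x/y plane.
    printf("%f", Distance3D(0.0, 0.0, 0.0, 3.0, 4.0, 0.0));
}
```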

[BDC]Scarface
04/08/2009, 03:39 PM
Hi Y_Less,

Fantastic Walk-through. Although I am familiar with most, your alternative method to use the return value of functions to determine success and player connection is simple yet innovative... I also wasn't aware that SA:MP's player functions have an inbuilt IsPlayerConnected(). Thanks for taking the time to write this and inform the many script writers in this community.

I would also be interested in how much resources Get/SetProperty() use for inter-script communication. I've got one of my good friend's thesis on efficient coding of C/C++ and C# somewhere, I'll ask him if I can extract certain parts.

Also, I find that often (as you mentioned at the start) people overlook the simplest of improvements.

E.g. when people use the MAX_PLAYERS definition (set to 200 in a_samp.inc)... If you're using this in loops and your server only has 100 max players, you are cycling through 100 IDs that will never exist. Simply change the "#define MAX_PLAYERS 200" line in the include or make your own definition, e.g. (#define MAXP 100).

That's only one example. There are several more.

Y_Less
05/08/2009, 08:52 AM
I would be interested to see that thesis if possible.

And yes, changing the define is important, but I'd argue that, given that I outlined a faster player loop in the main topic, it's more important for keeping array sizes small.

Zeex
12/08/2009, 06:49 PM
I didn't know in which topic I could ask about this, so I'm posting my question here.
Why does nobody use (at least in the scripts I've seen) CallLocalFunction in OnPlayerCommandText instead of strcmp or dcmd?

I've just tested this code:

#include <a_samp>

#define ITERATIONS (100000)

#define MAX_COMM_FUNC_NAME 32
CallCommandFunction(playerid, cmdtext[])
{
new
funcname[MAX_COMM_FUNC_NAME],
index = strfind(cmdtext, " ");
strmid(funcname, cmdtext, 1, index);
format(funcname, MAX_COMM_FUNC_NAME, "cmd_%s", funcname);
return CallLocalFunction(funcname, "is", playerid, cmdtext[index+1]);
}


public OnFilterScriptInit()
{
Test(0, "/command9 bla bla bla");
return 1;
}

Test(playerid, cmdtext[])
{
new
t0,
t1,
t2,
i;
t0 = GetTickCount();
for (i = 0; i < ITERATIONS; i++)
{
if (!strcmp(cmdtext, "/command1", false, 9)) print("Code1");
if (!strcmp(cmdtext, "/command2", false, 9)) print("Code1");
if (!strcmp(cmdtext, "/command3", false, 9)) print("Code1");
if (!strcmp(cmdtext, "/command4", false, 9)) print("Code1");
if (!strcmp(cmdtext, "/command5", false, 9)) print("Code1");
if (!strcmp(cmdtext, "/command6", false, 9)) print("Code1");
if (!strcmp(cmdtext, "/command7", false, 9)) print("Code1");
if (!strcmp(cmdtext, "/command8", false, 9)) print("Code1");
if (!strcmp(cmdtext, "/command9", false, 9)) print("Code1");
}
t1 = GetTickCount();
for (i = 0; i < ITERATIONS; i++)
{
if (!CallCommandFunction(playerid, cmdtext))
{
// Unknown command
}
}
t2 = GetTickCount();
printf("Time 1: %04d, time 2: %04d", t1 - t0, t2 - t1);
}

forward cmd_command9(playerid, params[]);
public cmd_command9(playerid, params[])
{
print("Code2");
return 1;
}

and got the result: time 1: 5172, time 2: 4969, so the second version is faster than the first one.
The more commands there are, the slower the first version works, right? And what does the speed of the second code depend on? The number of local functions? Or something else?

Y_Less
13/08/2009, 01:01 PM
Actually the YSI command system does use CallRemoteFunction, but your results are very interesting. The only thing I would point out is that that code doesn't work if you don't have any parameters, but that's a minor point. Also, have you tested the speed when you use a command that doesn't exist or when there are loads of public functions?

Zeex
13/08/2009, 05:49 PM
Ok...
When I added 100 public functions, the speed of the second code hardly changed.
And if the entered command does not exist, time_1 is MUCH higher than time_2, by a factor of some tens at least.

Also, I noted that the speed of the first code depends on the number of character comparisons strcmp() makes between the entered command text and the other string, i.e.

strcmp("/command1", "/command2");

will be slower than

strcmp("/command", "/kommand");


The script that I used for tests:

#include <a_samp.inc>


#define ITERATIONS (100000)

#define MAX_COMM_FUNC_NAME 32

#define cmdtest_addcommand(%1,%2,%3) \
if (!strcmp(%3[1], #%1, true, %2) && (!%3[%2 + 1] || %3[%2 + 1] == 32)) \
{ \
print("Code1"); \
continue; \
}

#define cmdtest_addpublic(%1) \
forward cmd_%1(playerid, params[]); \
public cmd_%1(playerid, params[]) \
{ \
print("Code2"); \
return 1; \
}


public OnFilterScriptInit()
{
Test(0, "/command0");
return 1;
}

CallCommandFunction(playerid, cmdtext[])
{
new
funcname[MAX_COMM_FUNC_NAME],
index;
if ((index = strfind(cmdtext, " ")) == -1) index = strlen(cmdtext);
strmid(funcname, cmdtext, 1, index);
format(funcname, MAX_COMM_FUNC_NAME, "cmd_%s", funcname);
return CallLocalFunction(funcname, "is", playerid, cmdtext[index+1]);
}

Test(playerid, cmdtext[])
{
new
t0,
t1,
t2,
i;

print("Starting test...");
t0 = GetTickCount();
for (i = 0; i < ITERATIONS; i++)
{
cmdtest_addcommand(command0, 8, cmdtext)
cmdtest_addcommand(command1, 8, cmdtext)
cmdtest_addcommand(command2, 8, cmdtext)
cmdtest_addcommand(command3, 8, cmdtext)
cmdtest_addcommand(command4, 8, cmdtext)
cmdtest_addcommand(command5, 8, cmdtext)
cmdtest_addcommand(command6, 8, cmdtext)
cmdtest_addcommand(command7, 8, cmdtext)
cmdtest_addcommand(command8, 8, cmdtext)
cmdtest_addcommand(command9, 8, cmdtext)
cmdtest_addcommand(command10, 9, cmdtext)
cmdtest_addcommand(command11, 9, cmdtext)
cmdtest_addcommand(command12, 9, cmdtext)
cmdtest_addcommand(command13, 9, cmdtext)
cmdtest_addcommand(command14, 9, cmdtext)
cmdtest_addcommand(command15, 9, cmdtext)
cmdtest_addcommand(command16, 9, cmdtext)
cmdtest_addcommand(command17, 9, cmdtext)
cmdtest_addcommand(command18, 9, cmdtext)
cmdtest_addcommand(command19, 9, cmdtext)
cmdtest_addcommand(command20, 9, cmdtext)
cmdtest_addcommand(command21, 9, cmdtext)
cmdtest_addcommand(command22, 9, cmdtext)
cmdtest_addcommand(command23, 9, cmdtext)
cmdtest_addcommand(command24, 9, cmdtext)
cmdtest_addcommand(command25, 9, cmdtext)
cmdtest_addcommand(command26, 9, cmdtext)
cmdtest_addcommand(command27, 9, cmdtext)
cmdtest_addcommand(command28, 9, cmdtext)
cmdtest_addcommand(command29, 9, cmdtext)
cmdtest_addcommand(command30, 9, cmdtext)
cmdtest_addcommand(command31, 9, cmdtext)
cmdtest_addcommand(command32, 9, cmdtext)
cmdtest_addcommand(command33, 9, cmdtext)
cmdtest_addcommand(command34, 9, cmdtext)
cmdtest_addcommand(command35, 9, cmdtext)
cmdtest_addcommand(command36, 9, cmdtext)
cmdtest_addcommand(command37, 9, cmdtext)
cmdtest_addcommand(command38, 9, cmdtext)
cmdtest_addcommand(command39, 9, cmdtext)
cmdtest_addcommand(command40, 9, cmdtext)
cmdtest_addcommand(command41, 9, cmdtext)
cmdtest_addcommand(command42, 9, cmdtext)
cmdtest_addcommand(command43, 9, cmdtext)
cmdtest_addcommand(command44, 9, cmdtext)
cmdtest_addcommand(command45, 9, cmdtext)
cmdtest_addcommand(command46, 9, cmdtext)
cmdtest_addcommand(command47, 9, cmdtext)
cmdtest_addcommand(command48, 9, cmdtext)
cmdtest_addcommand(command49, 9, cmdtext)
cmdtest_addcommand(command50, 9, cmdtext)
cmdtest_addcommand(command51, 9, cmdtext)
cmdtest_addcommand(command52, 9, cmdtext)
cmdtest_addcommand(command53, 9, cmdtext)
cmdtest_addcommand(command54, 9, cmdtext)
cmdtest_addcommand(command55, 9, cmdtext)
cmdtest_addcommand(command56, 9, cmdtext)
cmdtest_addcommand(command57, 9, cmdtext)
cmdtest_addcommand(command58, 9, cmdtext)
cmdtest_addcommand(command59, 9, cmdtext)
cmdtest_addcommand(command60, 9, cmdtext)
cmdtest_addcommand(command61, 9, cmdtext)
cmdtest_addcommand(command62, 9, cmdtext)
cmdtest_addcommand(command63, 9, cmdtext)
cmdtest_addcommand(command64, 9, cmdtext)
cmdtest_addcommand(command65, 9, cmdtext)
cmdtest_addcommand(command66, 9, cmdtext)
cmdtest_addcommand(command67, 9, cmdtext)
cmdtest_addcommand(command68, 9, cmdtext)
cmdtest_addcommand(command69, 9, cmdtext)
cmdtest_addcommand(command70, 9, cmdtext)
cmdtest_addcommand(command71, 9, cmdtext)
cmdtest_addcommand(command72, 9, cmdtext)
cmdtest_addcommand(command73, 9, cmdtext)
cmdtest_addcommand(command74, 9, cmdtext)
cmdtest_addcommand(command75, 9, cmdtext)
cmdtest_addcommand(command76, 9, cmdtext)
cmdtest_addcommand(command77, 9, cmdtext)
cmdtest_addcommand(command78, 9, cmdtext)
cmdtest_addcommand(command79, 9, cmdtext)
cmdtest_addcommand(command80, 9, cmdtext)
cmdtest_addcommand(command81, 9, cmdtext)
cmdtest_addcommand(command82, 9, cmdtext)
cmdtest_addcommand(command83, 9, cmdtext)
cmdtest_addcommand(command84, 9, cmdtext)
cmdtest_addcommand(command85, 9, cmdtext)
cmdtest_addcommand(command86, 9, cmdtext)
cmdtest_addcommand(command87, 9, cmdtext)
cmdtest_addcommand(command88, 9, cmdtext)
cmdtest_addcommand(command89, 9, cmdtext)
cmdtest_addcommand(command90, 9, cmdtext)
cmdtest_addcommand(command91, 9, cmdtext)
cmdtest_addcommand(command92, 9, cmdtext)
cmdtest_addcommand(command93, 9, cmdtext)
cmdtest_addcommand(command94, 9, cmdtext)
cmdtest_addcommand(command95, 9, cmdtext)
cmdtest_addcommand(command96, 9, cmdtext)
cmdtest_addcommand(command97, 9, cmdtext)
cmdtest_addcommand(command98, 9, cmdtext)
cmdtest_addcommand(command99, 9, cmdtext)
}
t1 = GetTickCount();
for (i = 0; i < ITERATIONS; i++)
{
if (!CallCommandFunction(playerid, cmdtext))
{
// Send Unknown Command message
}
}
t2 = GetTickCount();
printf("Time 1: %04d, time 2: %04d", t1 - t0, t2 - t1);
}

cmdtest_addpublic(command0)
cmdtest_addpublic(command1)
cmdtest_addpublic(command2)
cmdtest_addpublic(command3)
cmdtest_addpublic(command4)
cmdtest_addpublic(command5)
cmdtest_addpublic(command6)
cmdtest_addpublic(command7)
cmdtest_addpublic(command8)
cmdtest_addpublic(command9)
cmdtest_addpublic(command10)
cmdtest_addpublic(command11)
cmdtest_addpublic(command12)
cmdtest_addpublic(command13)
cmdtest_addpublic(command14)
cmdtest_addpublic(command15)
cmdtest_addpublic(command16)
cmdtest_addpublic(command17)
cmdtest_addpublic(command18)
cmdtest_addpublic(command19)
cmdtest_addpublic(command20)
cmdtest_addpublic(command21)
cmdtest_addpublic(command22)
cmdtest_addpublic(command23)
cmdtest_addpublic(command24)
cmdtest_addpublic(command25)
cmdtest_addpublic(command26)
cmdtest_addpublic(command27)
cmdtest_addpublic(command28)
cmdtest_addpublic(command29)
cmdtest_addpublic(command30)
cmdtest_addpublic(command31)
cmdtest_addpublic(command32)
cmdtest_addpublic(command33)
cmdtest_addpublic(command34)
cmdtest_addpublic(command35)
cmdtest_addpublic(command36)
cmdtest_addpublic(command37)
cmdtest_addpublic(command38)
cmdtest_addpublic(command39)
cmdtest_addpublic(command40)
cmdtest_addpublic(command41)
cmdtest_addpublic(command42)
cmdtest_addpublic(command43)
cmdtest_addpublic(command44)
cmdtest_addpublic(command45)
cmdtest_addpublic(command46)
cmdtest_addpublic(command47)
cmdtest_addpublic(command48)
cmdtest_addpublic(command49)
cmdtest_addpublic(command50)
cmdtest_addpublic(command51)
cmdtest_addpublic(command52)
cmdtest_addpublic(command53)
cmdtest_addpublic(command54)
cmdtest_addpublic(command55)
cmdtest_addpublic(command56)
cmdtest_addpublic(command57)
cmdtest_addpublic(command58)
cmdtest_addpublic(command59)
cmdtest_addpublic(command60)
cmdtest_addpublic(command61)
cmdtest_addpublic(command62)
cmdtest_addpublic(command63)
cmdtest_addpublic(command64)
cmdtest_addpublic(command65)
cmdtest_addpublic(command66)
cmdtest_addpublic(command67)
cmdtest_addpublic(command68)
cmdtest_addpublic(command69)
cmdtest_addpublic(command70)
cmdtest_addpublic(command71)
cmdtest_addpublic(command72)
cmdtest_addpublic(command73)
cmdtest_addpublic(command74)
cmdtest_addpublic(command75)
cmdtest_addpublic(command76)
cmdtest_addpublic(command77)
cmdtest_addpublic(command78)
cmdtest_addpublic(command79)
cmdtest_addpublic(command80)
cmdtest_addpublic(command81)
cmdtest_addpublic(command82)
cmdtest_addpublic(command83)
cmdtest_addpublic(command84)
cmdtest_addpublic(command85)
cmdtest_addpublic(command86)
cmdtest_addpublic(command87)
cmdtest_addpublic(command88)
cmdtest_addpublic(command89)
cmdtest_addpublic(command90)
cmdtest_addpublic(command91)
cmdtest_addpublic(command92)
cmdtest_addpublic(command93)
cmdtest_addpublic(command94)
cmdtest_addpublic(command95)
cmdtest_addpublic(command96)
cmdtest_addpublic(command97)
cmdtest_addpublic(command98)
cmdtest_addpublic(command99)

Y_Less
13/08/2009, 06:08 PM
That is very interesting then. May I suggest you write it up in a release or discussion topic? As I said I use this in YSI however there's a lot of other overhead in there to do with the dynamic commands in it so (especially for small numbers of commands) it's actually slower than the normal system, I never bothered testing without the additional overhead as you've done, so well done.

Zeex
14/08/2009, 01:28 PM
OK, I released it as an include in this topic (http://forum.sa-mp.com/index.php?topic=116240.0).

Y_Less
20/08/2009, 07:34 PM
Added a new section on numbers:

http://forum.sa-mp.com/index.php?topic=79810.msg522270#post_infinity

Recycler
30/09/2009, 05:12 PM
I found a minor typo in one of your code examples:

if (gIsACar[model - 400]}

should be, of course:

if (gIsACar[model - 400])

Thanks for maintaining such a great collection of code optimisations. Really helps me developing my scripts :)

Y_Less
01/10/2009, 09:01 AM
Thanks, fixed (also found another minor typo just below it while re-reading that section).

And I'm glad some people are finding it useful.

yezizhu
02/10/2009, 06:24 AM
Rereading this topic, I now understand a lot that I hadn't understood before ^^
I don't know if I misunderstood it or you made a small mistake.
In types section

2 3 3
1 2 -1
-1 0 2

the last pointer of the 3rd number points to itself; should it be

2 3 3
1 2 -1
-1 0 1

?

Y_Less
02/10/2009, 10:27 AM
No, you are correct, it should be 1 - fixed. However it took me a while to confirm that was wrong as I had to read my own explanation several times before even I understood it - that can't be a good sign!

Google63
05/10/2009, 09:09 PM
I think that half or more of the people here don't know what the stack actually is (I do :P).
So you could make a new section on the stack and explain it... It is good to know (and you must know it very well if you do assembly).

Y_Less
06/10/2009, 08:05 AM
I may do, but for now I've added a section on state machines too (or rather a link to my other topic on them).

Edit: Also added something on callback hooks.

Hiitch
08/10/2009, 07:27 PM
Hi Y_Less, I've come to ask you a question regarding these code optimizations.

My question is this, when coding, I get a message like this

Pawn compiler 3.2.3664 Copyright (c) 1997-2006, ITB CompuPhase

Header size: 3012 bytes
Code size: 63692 bytes
Data size: 45104 bytes
Stack/heap size: 16384 bytes; estimated max. usage=4257 cells (17028 bytes)
Total requirements: 128192 bytes


so I talked to my friend about it. He told me that I should put #pragma dynamic 8192. Now out of curiosity, what are the pros and cons of using #pragma dynamic?

I remember on one of your previous topics you mentioned that changing the 256's to 128's would solve this problem, but for me, since I use strcmp, and haven't gone to using DCMD, would I be at a slight dis-advantage?

Y_Less
09/10/2009, 08:18 AM
No, people using 256 are just wasting vast amounts of memory on nothing - the maximum text input/output is 128 so you will never use the other half of your arrays.

Technically people using anything other than 64 cell packed strings are wasting memory too, but that's beside the point and a battle for me for another day.
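
A minimal sketch of what sizing to the 128 limit looks like in practice. The callback and natives are standard a_samp ones; the message text and colour are just placeholders:

```pawn
#include <a_samp>

public OnPlayerConnect(playerid)
{
    // 128 cells is enough for any single chat message, so a 256 cell
    // buffer here would leave half of the array unused.
    new
        name[MAX_PLAYER_NAME],
        msg[128];
    GetPlayerName(playerid, name, sizeof (name));
    // sizeof keeps the length argument in step with the declaration.
    format(msg, sizeof (msg), "%s has joined the server", name);
    SendClientMessageToAll(0xFFFFFFFF, msg);
    return 1;
}
```

Using sizeof instead of a literal length means the array can be resized later without hunting for every call that uses it.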

Y_Less
21/10/2009, 01:41 PM
I've added a mention of the new 0.3 distance checks, but to get it in I had to remove a lot of random bits of spacing because it's so close to the character limit now.

Malice
23/10/2009, 12:32 PM
Hello Y_Less, I have implemented your bit manipulation of variables, but I am confused with 1 thing. Let's say you have set all 32 bits of a cell to 1. How will Pawn interpret this variable? Normally this would make it -1. Maybe bit manipulation is immune to this? I hope you understand what I'm getting at. Thanks.

Y_Less
23/10/2009, 12:35 PM
That is -1, but you don't display or use that value directly. The actual value of the variable is unimportant, it's just the set bits that matter.
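
A small sketch of that point, assuming 32 bit cells as in SA:MP (the variable name is made up for the example):

```pawn
#include <a_samp>

main()
{
    // All 32 bits set. Read as a signed number this cell is -1.
    new flags = 0xFFFFFFFF;
    printf("%d", flags);

    // Each bit can still be tested on its own, whatever the number looks like.
    for (new bit = 0; bit != 32; bit++)
    {
        if (!(flags & (1 << bit)))
        {
            printf("bit %d is clear", bit);
        }
    }

    // Clearing only bit 31 (the sign bit) leaves the other 31 bits set.
    flags &= ~(1 << 31);
    printf("%d", flags);
}
```

The loop prints nothing because every bit is set; the numeric value only changes sign when bit 31 is cleared.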

Maniek
27/10/2009, 08:34 PM
I have a question about object streamers.
Which is faster:
MidoBan or Double-O-Stream?

Y_Less
28/10/2009, 10:00 AM
Y_Objects, simple as. I've just been through the code for both the ones you mentioned and neither of them pay any attention to what is said in this topic.

Google63
28/10/2009, 12:29 PM
Hi, how do I make a function with #emit?
I have a very good understanding of ASM and I've seen all the opcodes (and what they do in ASM), but can you explain how to make a function...
I do know that in ASM functions are like labels:

example:

shit:
;blablabla
secondfunc:
;blablabla


So my question is: how do I make a function ONLY with AMX opcodes, and is that really possible, or can I only #emit specific opcodes?
Thanks :)

MenaceX^
28/10/2009, 12:35 PM
I have a question about object streamers.
Which is faster:
MidoBan or Double-O-Stream?
First time I see a question like this where Y_Less' streamer is not involved.
Use Y_Objects.

Added: This isn't even supposed to be asked here.

Y_Less
28/10/2009, 02:26 PM
I had a quick play, I've never used #emit but it does present some interesting possibilities:


stock MyEmitFunc()
{
#emit PUSH.pri a
}


That would create a stack corrupting function, defining the functions is just the same as normal. If you can get something like:


new a = 42;
#emit PUSH a
printf("%d");
#emit POP.pri


To work, please tell me how as it always gives the wrong answer for me.

Google63
28/10/2009, 08:49 PM
If you call printf("%d") on its own (no opcodes, or opcodes that don't work) you get 0 by default, I tried. So it can only mean you need a different solution. The .pri and .alt suffixes are weird to me as they aren't well documented anywhere (AMX opcodes in general, I searched a lot). In ASM, for example, you know there are CPU registers....

edit: I think you must call the target function with one of the "call" opcodes. I will do some testing on it and will report if I find anything interesting.
edit2: I thought about it and found a solution: write the function you want in PAWN and call pawncc directly with the -a parameter (which makes it output AMX opcodes), so you can see how it is processed and integrate that into your own code ;)

As I said, here are the results:
amx opcodes:


CODE 0 ; 0
;program exit point
halt 0

proc ; main
; line 4
; line 5
break ; c
push.c 0
call customFunction
retn

proc ; customFunction
; line 9
; line a
break ; 28
;$lcl a fffffffc
push.c 20
;$exp
; line c
break ; 34
push.adr fffffffc
;$par
push.c 0
;$par
push.c 8
sysreq.c 0 ; printf
stack c
;$exp
; line d
break ; 60
const.pri 1
stack 4
retn


DATA 0 ; 0
dump 25 64 0

STKSIZE 1000


pawn code

#include <a_samp>

main()
{
return customFunction();
}

customFunction()
{
new
a = 32;
printf("%d", a);
return 1;
}


And so you can combine this code into the result you want ;)

While I was playing I managed to produce some working code in AMX opcodes:

main()
{
printf("\n %d ", customFunction(10, 12));
return 1;
}

customFunction(numberOne, numberTwo)
{
#emit load.s.pri numberOne
#emit load.s.alt numberTwo
#emit smul
#emit retn
}

For "customFunction" the compiler gives a "no return" warning because it doesn't check whether the "retn" opcode is used, so you can just ignore that warning; the program is tested and working.
Here you can see that you don't need the arguments' memory location on the stack (esp + x); specific opcodes are used instead.

Y_Less
29/10/2009, 10:19 AM
I also had quite a play about with this, I've almost got variadic argument passthrough working, so you can do:


va(...)
{
printf("hi", <arguments passed to "va">);
}


But it's not quite working yet and the code is at home, but as I said it's interesting.

Google63
29/10/2009, 10:35 AM
I was looking at calling natives, but I simply can't get them working. Here is one thing I don't understand:

Calling in-script functions is pretty easy

main()
{
#emit call someFunction
#emit retn
}
someFunction()
{
/* code */
}


I did succeed in getting a half-working "print", but I somehow pushed the string onto the stack wrongly. I don't know where the problem in pushing is, as I am using exactly the same instruction for the push.

Anyway, I didn't quite understand how the VM can call a function like this:

/* 0 is param named FUNCTION NUMBER*/
#emit sysreq.c 0


But it doesn't make any sense: function numbers are just in numerical order and as far as I can see no address is looked up anywhere. I also got call by reference and array creation working, but natives are really hard to get working with opcodes :S

Y_Less
29/10/2009, 10:41 AM
You can do:


#emit SYSREQ.C print


But you have to make sure that print is called somewhere else in your script or the reference to it won't be included in the final output. I got a call to random working (and printf, but that was more complex). Off the top of my head I think the code was:


new i = 4;
#emit BREAK
#emit PUSH.S i
#emit PUSH.C 4
#emit SYSREQ.C random
#emit STACK 8
#emit STOR.S.pri i


Which should be equivalent to:


new i = 4;
i = random(i);


But it may not be exactly accurate.

Google63
29/10/2009, 11:19 AM
I got it working, and it is good ;)
I think there must be a way to do this without calling it anywhere else but in your function, because this is done by the compiler, which probably calculates an address or uses some opcode, and doing it this way is pretty ugly... I will have a deeper look at the opcodes for addresses.

Y_Less
29/10/2009, 12:11 PM
No, the number (0 in your example) is an index into a table of function names. If you compile your code with:

#pragma compress 0

And look at the amx in a hex editor you will see a list of all used natives near the bottom (or top, somewhere anyway). Only functions the compiler knows are used are put into this table and SYSREQ.C uses that as a lookup (note that after it's called once it is replaced by a SYSREQ call with a direct address from the table lookup).

Google63
29/10/2009, 12:16 PM
Thanks for the nice explanation. Anyway, did you measure how much faster code using AMX opcodes is? It would be nice to see the difference...

Y_Less
29/10/2009, 01:40 PM
No, I've not got far enough to write optimised assembler yet.

Daren_Jacobson
03/11/2009, 04:01 AM
could you explain why strcmp is used and not strequal, i just noticed this function reading String_Manipulation.pdf
saw that strequal isn't included, now i am wondering why that is.

Y_Less
03/11/2009, 09:03 AM
Because that's a later version of PAWN than the one SA:MP uses.

Double-O-Seven
04/11/2009, 11:46 AM
Y_Less? You should check the distance functions with different distances...

[12:39:27] 0: 1466 1384 900 937 1031 1263 1422 1028 1801
[12:39:39] 1: 1414 1515 946 942 1046 1259 1386 996 1841
[12:39:50] 2: 1477 1358 890 924 1027 1216 1356 984 1750
[12:40:01] 3: 1457 1373 899 928 1041 1220 1356 987 1761
[12:40:12] 4: 1359 1298 923 939 1041 1176 1341 1036 1619
[12:40:22] 5: 1453 1316 937 937 1046 1187 1335 1022 1151
[12:40:31] 6: 978 975 917 932 1037 900 998 995 1127
[12:40:40] 7: 979 971 918 927 1041 903 1013 1017 1161
[12:40:49] 8: 1000 1000 936 942 1053 920 998 996 1124
[12:40:58] 9: 977 974 917 927 1037 902 1003 996 1123
[12:41:06] 10: 973 968 921 927 1037 897 995 1000 735
[12:41:13] 11: 626 594 917 926 1036 625 673 998 718
[12:41:20] 12: 626 596 920 933 1039 620 673 999 717
[12:41:27] 13: 631 596 920 928 1040 620 678 997 718
[12:41:35] 14: 626 596 918 932 1037 620 672 997 719
[12:41:42] 15: 630 623 1014 1033 1100 650 695 1025 738
[12:41:50] 16: 659 616 944 990 1197 691 716 1030 765
[12:41:57] 17: 654 614 922 931 1040 623 676 1002 723
[12:42:04] 18: 625 598 925 932 1050 636 673 1022 722
[12:42:11] 19: 641 605 948 941 1181 672 680 1012 725
[12:42:19] 20: 630 598 926 933 1047 624 677 1005 727
[12:42:26] 21: 637 600 932 934 1050 629 677 1003 727
[12:42:33] 22: 633 601 927 935 1051 635 681 1007 722
[12:42:40] 23: 632 601 928 941 1048 625 680 1009 730
[12:42:47] 24: 638 601 927 935 1059 628 679 1002 727

I edited your test script, added another type and checked the times.
If the real distance is very small, some functions are very slow. But if the distance is large, these functions become very fast.

Check this out: Pastebin (http://pastebin.com/m637158c)

Y_Less
04/11/2009, 01:03 PM
Hmm, looking at your code has revealed a HUGE bug in my original code:

printf("%d %d %d %d %d %d %d %d %d", time1 - time0, time2 - time1, time3 - time2, time4 - time3, time5 - time4, time6 - time5, time7 - time6, time8 - time7);

There are 9 "%d"s and only 8 times, which means the "45" I got for the last one, which I read as the result for type 8, isn't - bugger!

Y_Less
05/11/2009, 07:25 PM
OK, I've updated the first post correcting my mistake (I should have checked the original results, but I thought the huge discrepancy was caused by function calling overheads). Don't use the #define distance check, use the optimised function (I wonder if I can optimise it further with emit assembly, deserves testing but pointless for 0.3 as I'm not going to beat the native function).

Edit: Double-O-Seven those functions vary because they can take advantage of short circuiting. The reason they're so much faster with larger numbers is because they fail faster because the x is already too large. If you have an x and y co-ordinate which is in range, but not a z they will take longer (try only multiplying your z in the code) - and they take far longer when you are in range. The distance check functions are more accurate AND operate in constant time no matter where you are relative to the point. This does raise an interesting optimisation question - do you use a slower function which is faster when you are far away, assuming that the player is more likely to be far away than close most of the time, or do you use a function which is always the same speed?

If you look at your results there are three distinct groups:

The first group (tests 0-5) are when the point is in range, or the Z is out of range, so every test is run, for this set the area check functions are all around 1500ms.

The second group (tests 6-10) are when the Y co-ordinate is out of range (as it increases the fastest), so the code needs to run the X and Y tests, but not the last ones, for this set the area check functions are in the order of 900ms each (approximately equal to the constant time of the range check functions - a good argument for the area checks).

The third group is when the X co-ordinate (i.e. the first axis checked) is out of range. Due to the execution order and short circuiting this means that Y and Z are never checked, saving time, and these are the results in the order of 600ms (tests 11+)

Of course, as I've now said the point is moot in 0.3 as there's no denying that the native function is the fastest by a long way all the time.
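
To make the difference concrete, here is a rough sketch of the two styles being compared. These are not the exact functions from the test script; the names and parameters are made up for illustration:

```pawn
#include <a_samp>

// Axis-by-axis "area" check. The && chain short circuits, so a point that is
// already too far away on x never has its y or z tested.
stock bool:IsPointInArea(Float:x, Float:y, Float:z, Float:px, Float:py, Float:pz, Float:range)
{
    return floatabs(x - px) <= range &&
        floatabs(y - py) <= range &&
        floatabs(z - pz) <= range;
}

// Squared distance check. Always does the same work wherever the point is,
// and tests a true sphere rather than a cube.
stock bool:IsPointInRange(Float:x, Float:y, Float:z, Float:px, Float:py, Float:pz, Float:range)
{
    x -= px;
    y -= py;
    z -= pz;
    return ((x * x) + (y * y) + (z * z)) <= (range * range);
}
```

The first function is where the three timing groups come from: it does one, two or three comparisons depending on which axis fails first. The second is the constant time behaviour, and comparing squared values avoids a square root.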

yom
08/11/2009, 01:33 PM
I'm interested by #emit, but i just can't understand it.

This small script:

native print(s[]);

main()
print("Hello World");


Disassembled (with pawndisasm.exe - can be found in the pawn toolkit), it gives this:

;File version: 8
;Flags: compact-encoding

00000000 halt 00000000

00000008 proc
0000000c break
00000010 push.c 00000000
00000018 push.c 00000004
00000020 sysreq.c 00000000
00000028 stack 00000008
00000030 zero.pri
00000034 retn

;DATA
00000000 00000048 00000065 0000006c 0000006c H e l l
00000010 0000006f 00000020 00000057 0000006f o W o
00000020 00000072 0000006c 00000064 00000000 r l d


Now how to reproduce this with #emit? Especially the DATA part.

Also i don't think it will be faster than normal pawn code, since once compiled it's similar (or no?), i just would like to know because i'm curious.

Y_Less
08/11/2009, 02:31 PM
You don't emit the data section, however I've not quite figured out how to emit static strings like that yet. However I can break down what that's doing for you and what it would look like in PAWN:


00000000 halt 00000000 ; Ends the program (the final return jumps here)

00000008 proc ; This is the start of the function, not needed in your code
0000000c break ; Seems to delimit separate operations
00000010 push.c 00000000 ; Push the address of the string onto the stack
00000018 push.c 00000004 ; Push the amount of memory pushed onto the stack
00000020 sysreq.c 00000000; Call the native defined at index 0 of the script local array (print as you only use one)
00000028 stack 00000008 ; Reduce the stack by 8 to remove the 8 bytes you pushed
00000030 zero.pri ; Blank the primary register (to return 0)
00000034 retn ; End the current function


In PAWN using #emit this would look something* like:


native print(s[]);

main()
{
#emit BREAK
#emit PUSH.C 0
#emit PUSH.C 4
#emit SYSREQ.C 0
#emit STACK 8
}


*It won't look exactly like this because you don't call print directly, so it won't be included in the file output and this code will try to call a function which doesn't exist as far as the script is concerned. This will also not push the string properly.

As for why you would do it, the PAWN compiler is not very optimising, so there are a number of optimisations you can make if you code by hand. For example:


native print(s[]);

main()
{
print("Hello World");
print("Hello World");
print("Hello World");
print("Hello World");
print("Hello World");
}


That will produce the code you had (or most of it) 5 times, whereas if you coded it by hand you could optimise out the extra pushes and pops to produce (note that this isn't tested):


native print(s[]);

main()
{
#emit BREAK
#emit PUSH.C 0
#emit PUSH.C 4
#emit SYSREQ.C 0
#emit SYSREQ.C 0
#emit SYSREQ.C 0
#emit SYSREQ.C 0
#emit SYSREQ.C 0
#emit STACK 8
}


So each call gets exactly the same stack.

yom
09/11/2009, 07:35 PM
Thanks, that's really interesting, it looks a bit complicated but i will study that in details. :)

Showman
10/11/2009, 10:28 AM
Very Good !

Google63
14/11/2009, 08:03 PM
Note:
Here is how PAWN compiler calculates stack size for native call:


outval((numargs+1)*sizeof(cell),TRUE,TRUE);


or simplified

((number of arguments the function takes) + 1) * 4


print's stack size allocated:

(1+1)*4 = 8


The cell size depends on the architecture the AMX was built for (4 bytes = 32 bit, etc.)
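
The same arithmetic as a quick sketch in PAWN. NativeStackBytes is a made-up helper name; cellbits is the compiler's predefined cell width in bits:

```pawn
#include <a_samp>

// Hypothetical helper: bytes taken off the stack after a native call,
// i.e. one cell per argument plus one cell for the argument byte count.
stock NativeStackBytes(numargs)
{
    return (numargs + 1) * (cellbits / 8);
}

main()
{
    // print takes one argument, so with 32 bit cells this is (1 + 1) * 4.
    printf("%d", NativeStackBytes(1));
    // A native with two arguments needs one more cell.
    printf("%d", NativeStackBytes(2));
}
```

This is the operand of the "stack" instruction that follows "sysreq.c" in the disassembly earlier in the thread.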

Y_Less
15/11/2009, 04:24 AM
Anyone figured out how to make a pass-through printf like function yet? As in:


myprintf(format[], {Float,_}:...)
{
printf(format, ...);
}


I know it's possible with this, I've just not had chance to look into it yet. And this is one of the grails of SA:MP coding as it means you can do things like:

SendClientFormattedMessage(playerid, color, msg[], {Float,_}:...)

Properly, without odd looking macros (although those do work, but there can be advantages to doing it this way).

Daren_Jacobson
15/11/2009, 11:50 PM
it is really that simple? so then my iFormat function should work like this.

iFormat(const string[], {Float,_}:...)
{
new output[256];
format(output, 256, string, ...);
return output;
}

now I need to go test it.

Damian
16/11/2009, 12:19 AM
It's not that simple which is why he's asking.

Y_Less
16/11/2009, 06:29 AM
No, you need to use custom assembly to re push all the passed parameters, or at least selected ones. However I think it may be easier than I thought as you don't need to push string literals, just string references, which was the bit I wasn't sure about, but that should be no harder than pushing any other regular parameter.

The code I gave there was just to demonstrate what I was trying to do, not how to do it.

Giacomand
16/11/2009, 07:18 AM
Anyone figured out how to make a pass-through printf like function yet? As in:


myprintf(format[], {Float,_}:...)
{
printf(format, ...);
}


I know it's possible with this, I've just not had chance to look into it yet. And this is one of the grails of SA:MP coding as it means you can do things like:

SendClientFormattedMessage(playerid, color, msg[], {Float,_}:...)

Properly, without odd looking macros (although those do work, but there can be advantages to doing it this way).


I asked a question like that once, and you responded by saying to use defines.

I think I asked about it in the co.uk forums, I can't seem to find it.

Damian
16/11/2009, 07:40 AM
No, you need to use custom assembly to re push all the passed parameters, or at least selected ones. However I think it may be easier than I thought as you don't need to push string literals, just string references, which was the bit I wasn't sure about, but that should be no harder than pushing any other regular parameter.

The code I gave there was just to demonstrate what I was trying to do, not how to do it.

Well if you get it done please post, I've been trying to find out for a while.

Giacomand
16/11/2009, 08:02 AM
I really do remember someone giving it to me and it worked... I'm gonna search my posts again..

Y_Less
16/11/2009, 01:02 PM
It can be done with defines, I've done it repeatedly, but that's not a function.

Giacomand
28/11/2009, 01:45 AM
http://forum.sa-mp.com/index.php?topic=101773.0 ?

Daren_Jacobson
28/11/2009, 05:06 AM
Concerning bit-wise manipulation, how would I set a bit if other bits were already set? += seems to blank the other bits out first and then just set that one. I am doing this just to compact some player variables, and ran into trouble.

Maniek
28/11/2009, 09:50 AM
Delete please.

MenaceX^
28/11/2009, 10:00 AM
No, you need to use custom assembly to re push all the passed parameters, or at least selected ones. However I think it may be easier than I thought as you don't need to push string literals, just string references, which was the bit I wasn't sure about, but that should be no harder than pushing any other regular parameter.

The code I gave there was just to demonstrate what I was trying to do, not how to do it.

Well if you get it done please post, I've been trying to find out for a while.

#define SendFormattedMessage(%0,%1,%2) \
do \
{ \
new sendfstring[128]; \
format(sendfstring,128,%2); \
SendClientMessage(%0,%1,sendfstring); \
}\
while(FALSE)

new
FALSE=false;

The FALSE variable is there so the loop condition isn't a constant expression, which the compiler would otherwise warn about.

Google63
28/11/2009, 11:06 AM
We were talking about opcodes and not about simple macros, anyone can do it in macro-way...

Backwardsman97
04/12/2009, 10:09 PM
I ran a speed test to see enum vs. arrays and the array was faster. I figured it would be, but it was only by a small bit. It's just more organized to do enum's but is it worth it being slightly slower?

Here's what I tested with 10000 iterations.

printf("%d, %d, %d",PInfo[0][stuff],PInfo[0][some],PInfo[0][more]);
//VS
printf("%d, %d, %d",stuff2[0],some2[0],more2[0]);


Here's the results.

[16:04:28] Time 1: 1636, time 2: 1459

Simon
04/12/2009, 11:52 PM
I don't believe the enum is what slowed down that code, as enums are simply just pre-processed constants AFAIK. I believe what slowed down that code is that you have a multi-dimensional array vs a single dimension array.

Zamaroht
06/12/2009, 07:14 AM
I've just finished reading it, it made me a lot of things clearer, although I still have to learn about bitwise operations.
Thanks for this, amazing work!

By the way, I noticed you didn't mention the usage of booleans variables in the Memory reduction section.
Taking a look at the example code you posted there:


new
gIsACar[] = {1, 0, 0, 1, 1, 0, 1, 0, 1, 1},
gIsAHeavyVehicle[] = {1, 0, 1, 0, 0, 0, 0, 0, 1, 0},
gIsABoat[] = {0, 1, 0, 0, 0, 0, 0, 1, 0, 0},
gIsAFireEngine[] = {0, 0, 1, 0, 0, 1, 0, 0, 0, 0};


Another way to improve it (which would be a lot more readable in my opinion) could be:

new
bool:gIsACar[] = {true, false, false, true, true, false, true, false, true, true},
bool:gIsAHeavyVehicle[] = {true, false, true, false, false, false, false, false, true, false},
bool:gIsABoat[] = {false, true, false, false, false, false, false, true, false, false},
bool:gIsAFireEngine[] = {false, false, true, false, false, true, false, false, false, false};


That would still reduce the memory consumption a lot.

Zeex
06/12/2009, 09:54 AM
bool: doesn't reduce memory usage, it's just a tag for those variables which should be only true (bool:1) or false (bool:0) but actually they are allowed to hold values from -2,147,483,648 to 2,147,483,647.

So your second example is the same as the first one...

Zamaroht
06/12/2009, 06:24 PM
I thought they also reduced the memory consumption of the variable. Thanks for explaining.

Backwardsman97
07/12/2009, 02:50 AM
I don't believe the enum is what slowed down that code, as enums are simply just pre-processed constants AFAIK. I believe what slowed down that code is that you have a multi-dimensional array vs a single dimension array.


Well that's how a lot of people use it.

yezizhu
07/12/2009, 11:56 PM
Seems you're right.
We can

stock getPlrPassword(playerid){
return gpInfo_password;
}

So no one needs to know the original type of this variable.

iLinx
08/12/2009, 12:05 AM
bool: doesn't reduce memory usage, it's just a tag for those variables which should be only true (bool:1) or false (bool:0) but actually they are allowed to hold values from -2,147,483,648 to 2,147,483,647.

So your second example is the same as the first one...


I thought they also reduced the memory consumption of the variable. Thanks for explaining.


a bool variable uses less than an int, it only needs to store either a 1 or a 0, meaning it will only take up a byte in memory (seeing as memory is stored in groupings of 4, a byte is the lowest amount of memory usage you can have on ram)
a bool does reduce memory usage, it only uses 1 byte, a normal int variable uses 8 bytes (iirc)

yom
08/12/2009, 12:41 AM
Not in PAWN, please read the manual...

Zeex
08/12/2009, 10:27 AM
You guys should read the first post a bit better.... PAWN IS TYPELESS, all variables are made of cells, one cell is usually 4 bytes. Those bool:, Float:, Menu:, etc. are just tags which can be used to indicate variable's special meaning or to define where it can be used or whatever else.
You can read more about tags on the wiki or pawn-lang.pdf.
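
A quick sketch of what "typeless" means here. The array and variable names are made up for the example:

```pawn
#include <a_samp>

new gPlain[10];
new bool:gTagged[10];

main()
{
    gPlain[0] = 1;
    gTagged[0] = true;

    // Both arrays are 10 cells long; the tag changes nothing about storage.
    printf("%d %d", sizeof (gPlain), sizeof (gTagged));

    // A bool: variable is still a whole cell, so with a tag override it can
    // hold any value a cell can.
    new bool:big = bool:123456;
    printf("%d %d", _:big, _:gTagged[0] + gPlain[0]);
}
```

The tag overrides (bool: and _:) are only there to keep the compiler from giving tag mismatch warnings; they generate no code.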

Backwardsman97
09/12/2009, 02:48 AM
I thought the bool tag was built in and the menu tag was made by SA-MP? Are you saying they have the same purpose?

yom
09/12/2009, 04:03 AM
A tag is a label that denotes the objective of—or the meaning of—a variable, a constant or a function result. Tags are optional, their only purpose is to allow a stronger compile-time error checking of operands in expressions, of function arguments and of array indices.


This mean you can do:

new f = 1.2;
printf("%f", f);

This will obviously throw a tag mismatch warning because f should be tagged with Float:, but will still work perfectly fine. If you ever want to disable those tag mismatch warnings, you can compile with option -w213.

Backwardsman97
09/12/2009, 04:19 AM
So do you save any memory by using the bool tag when you just want it to equal 1 or 0?

Simon
09/12/2009, 04:42 AM
No. Pawn only has one data type, that is the "cell". Tags are just a warning system. If you want to save memory then use bitwise manipulation on your variables which is demonstrated in the opening posts.
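
As a rough sketch of that approach, several true/false values per player can share one cell. The flag and function names here are invented for the example:

```pawn
#include <a_samp>

// Hypothetical flags, one bit each: 1, 2, 4.
enum (<<= 1)
{
    FLAG_LOGGED_IN = 1,
    FLAG_MUTED,
    FLAG_ADMIN
}

// One cell per player holds up to 32 of these flags.
new gPlayerFlags[MAX_PLAYERS];

stock SetMuted(playerid, bool:set)
{
    // |= sets one bit and &= ~ clears it; the other bits are untouched.
    if (set) gPlayerFlags[playerid] |= FLAG_MUTED;
    else gPlayerFlags[playerid] &= ~FLAG_MUTED;
}

stock bool:IsMuted(playerid)
{
    return (gPlayerFlags[playerid] & FLAG_MUTED) != 0;
}
```

Setting a bit with |= rather than += is what keeps it safe when the bit is already set: adding the value a second time would carry into the next bit.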



I don't believe the enum is what slowed down that code, as enums are simply just pre-processed constants AFAIK. I believe what slowed down that code is that you have a multi-dimensional array vs a single dimension array.


Well that's how a lot of people use it.


Yes, but it doesn't mean that using enums is slower. You can make single dimensional arrays using enums and enums can also be used to make tags, enums are just a smart way for the compiler to generate numbers and their usage shouldn't affect running code.
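
For example, a single dimensional array indexed by an enum might look like this (the names are made up for illustration):

```pawn
#include <a_samp>

enum E_STATS
{
    E_STATS_KILLS,
    E_STATS_DEATHS
}

// One dimension, two cells: the enum only supplies the size and the indexes.
new gStats[E_STATS];

main()
{
    // These indexes are compile time constants (0 and 1).
    gStats[E_STATS_KILLS] = 3;
    gStats[E_STATS_DEATHS] = 1;
    printf("%d %d", gStats[E_STATS_KILLS], gStats[E_STATS_DEATHS]);
}
```

Any extra cost in the earlier test would come from the second dimension of PInfo[0][stuff], not from the enum itself.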

Backwardsman97
10/12/2009, 12:41 AM
I always thought I was saving memory by using a bool tag or something, or else I wouldn't have gone through the trouble. I wanna learn more about bitwise operators and how to use them. Can anyone point me in the right direction? I used Google and it did help, but I would just like one of your opinions.

Google63
10/12/2009, 01:06 AM
With stuff like that the whole point is practice. You can't learn by just reading definitions and stuff. See some examples and try for yourself.

Jason_Gregory
11/12/2009, 09:53 AM
Why isn't there anything about packed strings yet?

yom
11/12/2009, 10:02 AM
There is: http://forum.sa-mp.com/index.php?topic=78026.0#post_packed

Y_Less
14/12/2009, 01:34 PM
I've just been playing about with some code rearrangements for another project and have developed a faster distance check. Note that this is still slower than the 0.3 native function when done in PAWN, but that's because of language differences - this is a better algorithm under certain circumstances and if the languages were the same it should be faster (although I compared it to the fastest PAWN version and it's not much faster). Basically I did some rearrangements, expanding and continuing the rearrangements in the first post:


Original formula:
(x * x) + (y * y) + (z * z) <= 5.0 * 5.0

With real locations:
((x1 - x2) * (x1 - x2)) + ((y1 - y2) * (y1 - y2)) + ((z1 - z2) * (z1 - z2)) <= (r * r)

Expanding the brackets:
((x1 * x1) + (x2 * x2) - (2 * x1 * x2)) + ((y1 * y1) + (y2 * y2) - (2 * y1 * y2)) + ((z1 * z1) + (z2 * z2) - (2 * z1 * z2)) <= (r * r)

Divide by 2:
((x1 * x1 / 2) + (x2 * x2 / 2) - (x1 * x2)) + ((y1 * y1 / 2) + (y2 * y2 / 2) - (y1 * y2)) + ((z1 * z1 / 2) + (z2 * z2 / 2) - (z1 * z2)) <= (r * r) / 2

Group terms:
(x1 * x1 + y1 * y1 + z1 * z1) / 2 + (x2 * x2 + y2 * y2 + z2 * z2) / 2 - (x1 * x2 + y1 * y2 + z1 * z2) <= (r * r) / 2


Now it's not very clear from that last formula, but there are a few important things to note here:


(x1 * x1 + y1 * y1 + z1 * z1) / 2


This ONLY uses information from one point (i.e. a player OR an object, not both). Regardless of the location of the other point, this is constant. I'll explain the significance of this in a bit.


(x2 * x2 + y2 * y2 + z2 * z2) / 2


This is constant for the other point, same as with the first group.


(x1 * x2 + y1 * y2 + z1 * z2)


This is the only part of the formula which relies on both points, and is much simpler to calculate than the original formula.
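
As a quick sanity check of the rearrangement (a sketch only; InRangePlain and InRangeGrouped are names made up for this example), the plain check and the regrouped check can be run side by side on the same two points:

```pawn
#include <a_samp>

// The straightforward squared-distance check.
stock bool:InRangePlain(Float:x1, Float:y1, Float:z1, Float:x2, Float:y2, Float:z2, Float:r)
{
    x1 -= x2;
    y1 -= y2;
    z1 -= z2;
    return ((x1 * x1) + (y1 * y1) + (z1 * z1)) <= (r * r);
}

// The regrouped check: s1 and s2 are the per-point halves, each of which
// only needs to be calculated once per point.
stock bool:InRangeGrouped(Float:s1, Float:s2, Float:x1, Float:y1, Float:z1, Float:x2, Float:y2, Float:z2, Float:r)
{
    return (s1 + s2 - ((x1 * x2) + (y1 * y2) + (z1 * z2))) <= ((r * r) / 2.0);
}

main()
{
    // Two points exactly 5.0 units apart (a 3-4-5 triangle in the XY plane).
    new
        Float:x1 = 10.0, Float:y1 = 20.0, Float:z1 = 5.0,
        Float:x2 = 13.0, Float:y2 = 24.0, Float:z2 = 5.0;
    // Each half depends on one point only.
    new
        Float:s1 = ((x1 * x1) + (y1 * y1) + (z1 * z1)) / 2.0,
        Float:s2 = ((x2 * x2) + (y2 * y2) + (z2 * z2)) / 2.0;
    // Radius 6.0: both checks should say "in range".
    printf("r = 6.0: plain %d, grouped %d", _:InRangePlain(x1, y1, z1, x2, y2, z2, 6.0), _:InRangeGrouped(s1, s2, x1, y1, z1, x2, y2, z2, 6.0));
    // Radius 4.0: both checks should say "out of range".
    printf("r = 4.0: plain %d, grouped %d", _:InRangePlain(x1, y1, z1, x2, y2, z2, 4.0), _:InRangeGrouped(s1, s2, x1, y1, z1, x2, y2, z2, 4.0));
}
```

One caveat worth keeping in mind: the regrouped form subtracts large, nearly equal numbers when the coordinates are big, so right at the edge of the radius the two versions may disagree through floating-point rounding. For streamer-style checks that is unlikely to matter.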

Let's take an insanely simple object streamer:


#define MAX_OBJECTS (200)

#define VIEW_DISTANCE (300.0)

enum E_OBJECT
{
E_OBJECT_MODEL,
Float:E_OBJECT_X,
Float:E_OBJECT_Y,
Float:E_OBJECT_Z
}

static
gObjects[MAX_OBJECTS][E_OBJECT],
gIndex = 0;

stock
AddObject(model, Float:x, Float:y, Float:z)
{
if (gIndex < MAX_OBJECTS)
{
gObjects[gIndex][E_OBJECT_MODEL] = model;
gObjects[gIndex][E_OBJECT_X] = x;
gObjects[gIndex][E_OBJECT_Y] = y;
gObjects[gIndex][E_OBJECT_Z] = z;
++gIndex;
}
}

static
DistanceCheck(Float:x1, Float:y1, Float:z1, Float:x2, Float:y2, Float:z2, Float:dist)
{
x1 -= x2;
x1 *= x1;
y1 -= y2;
y1 *= y1;
z1 -= z2;
z1 *= z1;
return (x1 + y1 + z1) <= dist;
}

forward Object_Loop();
public
Object_Loop()
{
new
Float:x,
Float:y,
Float:z;
foreach (Player, playerid)
{
GetPlayerPos(playerid, x, y, z);
for (new i = 0; i != gIndex; ++i)
{
if (DistanceCheck(x, y, z, gObjects[i][E_OBJECT_X], gObjects[i][E_OBJECT_Y], gObjects[i][E_OBJECT_Z], VIEW_DISTANCE * VIEW_DISTANCE))
{
CreatePlayerObject(playerid, gObjects[i][E_OBJECT_MODEL], gObjects[i][E_OBJECT_X], gObjects[i][E_OBJECT_Y], gObjects[i][E_OBJECT_Z], 0.0, 0.0, 0.0);
}
}
}
}


Ignore for now the major problems with this code, in that no objects are ever destroyed, it's just an illustration. We can optimise this code by using our new formula and some pre-calculated values:


#define MAX_OBJECTS (200)

#define VIEW_DISTANCE (300.0)

enum E_OBJECT
{
E_OBJECT_MODEL,
Float:E_OBJECT_X,
Float:E_OBJECT_Y,
Float:E_OBJECT_Z,
Float:E_OBJECT_SQUARE
}

static
gObjects[MAX_OBJECTS][E_OBJECT],
gIndex = 0;

stock
AddObject(model, Float:x, Float:y, Float:z)
{
if (gIndex < MAX_OBJECTS)
{
gObjects[gIndex][E_OBJECT_MODEL] = model;
gObjects[gIndex][E_OBJECT_X] = x;
gObjects[gIndex][E_OBJECT_Y] = y;
gObjects[gIndex][E_OBJECT_Z] = z;
gObjects[gIndex][E_OBJECT_SQUARE] = ((x * x) + (y * y) + (z * z)) / 2;
++gIndex;
}
}

static
DistanceCheck(Float:s1, Float:s2, Float:x1, Float:y1, Float:z1, Float:x2, Float:y2, Float:z2, Float:dist)
{
return (s1 + s2 - (x1 * x2) - (y1 * y2) - (z1 * z2)) <= dist;
}

forward Object_Loop();
public
Object_Loop()
{
new
Float:x,
Float:y,
Float:z,
Float:square;
foreach (Player, playerid)
{
GetPlayerPos(playerid, x, y, z);
square = ((x * x) + (y * y) + (z * z)) / 2;
for (new i = 0; i != gIndex; ++i)
{
if (DistanceCheck(square, gObjects[i][E_OBJECT_SQUARE], x, y, z, gObjects[i][E_OBJECT_X], gObjects[i][E_OBJECT_Y], gObjects[i][E_OBJECT_Z], (VIEW_DISTANCE * VIEW_DISTANCE) / 2))
{
CreatePlayerObject(playerid, gObjects[i][E_OBJECT_MODEL], gObjects[i][E_OBJECT_X], gObjects[i][E_OBJECT_Y], gObjects[i][E_OBJECT_Z], 0.0, 0.0, 0.0);
}
}
}
}


Now this code looks a lot more complicated (and is), but because certain parts of the calculation are done outside the loops, they are only done once instead of every time. A third of the calculation is only done once per object, instead of once per object per player per call - which could be hundreds of thousands of times if your server is running for a long time. Another third is done once per player per call, again instead of once per object per player per call - not quite as impressive a reduction, but still good. The remaining calculation is then also faster than the original - so the longer you run your server, the more of an improvement you see.

However as I said this is still slower than the native version, so if you want to use this you will need to implement the new DistanceCheck function in a plugin (and maybe the initialisation functions too (i.e. the ones which generate the squares divided by 2)).

skAiGZoR
26/12/2009, 12:30 AM
Hey man that's very cool thanks a lot :D

Zeex
26/12/2009, 01:01 AM
Why did they unstick this topic? :o

Nero_3D
26/12/2009, 01:17 AM
Why did they unstick this topic? :o


Maybe because sticky topics are mostly for the Little Cluckers,
but this topic isn't really important for them.

Simon
29/12/2009, 02:09 AM
Can we please have this re-stickied? Brilliant topic that should be viewed by all.

yezizhu
30/12/2009, 10:19 AM
Can we please have this re-stickied? Brilliant topic that should be viewed by all.

Don't think so. If someone is willing to learn more, they can easily find this topic.
On the other hand, a lazy newbie won't look at this topic even if it is stickied.

StrickenKid
23/01/2010, 06:06 PM
It should be re-stickied.

Masj
26/01/2010, 09:20 PM
Yeah, I had to search for 10 minutes to find this thread...

BP13
27/01/2010, 02:33 AM
Yeah, I had to search for 10 minutes to find this thread...


Search Button - 1 second.

Coicatak
30/01/2010, 10:34 PM
So if I'm not wrong the following function will return the distance between a player and a point, right?
Float:GetDistanceToPoint(playerid,Float:x,Float:y, Float:z)
{
new
Float:px,
Float:py,
Float:pz;
GetPlayerPos(playerid, px, py, pz);
px -= x;
py -= y;
pz -= z;
return (px * px) + (py * py) + (pz * pz);
}

Y_Less
30/01/2010, 11:43 PM
You are wrong - that will return the distance between them, squared. I.e. you would need to square root that value to get the true distance, but that's only a problem if you actually need the true distance - which you very rarely do. The first post has a large section on distance checks and other distance functions - I suggest you re-read it. Even if you want to do something like find the closest player to something you don't need the real distance, you can compare the squared values and get exactly the same result faster.

Toribio
31/01/2010, 02:16 PM
For example, when you don't need to show the distance between the points (or compare it with a square-rooted value), it's better to use the squared function:

stock Float:GetPlayerDistanceToPointSquared(playerid, Float:x, Float:y, Float:z)
{
new Float:pX, Float:pY, Float:pZ;
GetPlayerPos(playerid, pX, pY, pZ);
pX -= x, pY -= y, pZ -= z;
return ((pX * pX) + (pY * pY) + (pZ * pZ));
}

Here, no square root is taken in the function, so the return value is squared. I can then use it like this, for example:

stock GetClosestPlayer(playerid)
{
new
Float:x, Float:y, Float:z,
Float:dist = FLOAT_INFINITY,
plid = INVALID_PLAYER_ID;

foreach(Player, p)
{
if(p == playerid) continue; // skip the player we are measuring from
GetPlayerPos(p, x, y, z);
new Float:tdist = GetPlayerDistanceToPointSquared(playerid, x, y, z);
if(tdist < dist)
{
dist = tdist;
plid = p;
}
}
return plid;
}

But it won't behave as expected like this, because the return value is squared:

if(GetPlayerDistanceToPointSquared(playerid, x, y, z) <= 10.0)

In this case, you have to square the compared value too (10.0^2 = 10 * 10):

if(GetPlayerDistanceToPointSquared(playerid, x, y, z) <= 10.0 * 10.0)

Or simply use a function which does not return the squared value.

Coicatak
31/01/2010, 07:24 PM
Ok I understand. Thanks.

Y_Less
01/02/2010, 11:52 AM
In fact I wrote about a further optimisation to that code on the previous page:

http://forum.sa-mp.com/index.php?topic=79810.msg838820#msg838820

Note that most of the time you should use the SA:MP natives (assuming you're using 0.3) - even if they were badly written to use sqroot (which they're not) they would be faster than any PAWN version.

Zeex
06/02/2010, 10:12 AM
I also had quite a play about with this, I've almost got variadic argument passthrough working, so you can do:


va(...)
{
printf("hi", <arguments passed to "va">);
}


But it's not quite working yet and the code is at home, but as I said it's interesting.


Hi! I was playing with that for the last few days and finally got it working :D
If you're interested, here is my code for SendClientMessageFormatted:


stock SendClientMessageFormatted(playerid, color, fstring[], {Float, _}:...)
{
new n = numargs() * 4;

if (n == 3 * 4)
{
return SendClientMessage(playerid, color, fstring);
}
else
{
new message[128];
new arg_start;
new arg_end;
new i = 0;

#emit CONST.pri fstring
#emit ADD.C 0x4
#emit STOR.S.pri arg_start // first parameter's offset

#emit LOAD.S.pri n
#emit ADD.C 0x8
#emit STOR.S.pri arg_end // last parameter's offset

// pushing variable arguments
for (i = arg_end; i >= arg_start; i -= 4)
{
#emit LCTRL 5
#emit LOAD.S.alt i
#emit ADD
#emit LOAD.I
#emit PUSH.pri
}
// pushing normal arguments
#emit PUSH.S fstring // format string
#emit PUSH.C 128 // sizeof(message)
#emit PUSH.ADR message // the string which format() will write in
#emit PUSH.S n // number of arguments * 4, always must be passed for natives
#emit SYSREQ.C format

// clearing the stack
i = n / 4 + 1;
while (--i >= 0)
{
#emit STACK 0x4
}

return SendClientMessage(playerid, color, message);
}
}


It works pretty well for me, but there is a little problem - the compiler will just crash if you don't use format() anywhere else in your script. I haven't found a solution for this yet...

Also, the script which I tested it on: http://zeex.pastebin.ca/1787920

Y_Less
06/02/2010, 04:37 PM
Very nice! Well done! I like the method you used to get the offsets for the first and last variable parameters, however, as I'm sure you know, your example is very hard coded to the number of parameters, and relies on the fact that your function takes the same number of parameters as "format". I've tried to rewrite it a little more generically, I hope you don't mind:



// HUGE credit to ZeeX for finally cracking this problem:
// http://forum.sa-mp.com/index.php?topic=79810.msg901721#msg901721
stock CPF(playerid, color, fstring[], {Float, _}:...)
{
// This is the number of parameters which are not variable that are passed
// to this function (i.e. the number of named parameters).
static const
STATIC_ARGS = 3;
// Get the number of variable arguments.
new
n = (numargs() - STATIC_ARGS) * BYTES_PER_CELL;
if (n)
{
new
message[128],
arg_start,
arg_end;

// Load the real address of the last static parameter. Do this by
// loading the address of the last known static parameter and then
// adding the value of [FRM].
#emit CONST.alt fstring
#emit LCTRL 5
#emit ADD
#emit STOR.S.pri arg_start

// Load the address of the last variable parameter. Do this by adding
// the number of variable parameters on the value just loaded.
#emit LOAD.S.alt n
#emit ADD
#emit STOR.S.pri arg_end

// Push the variable arguments. This is done by loading the value of
// each one in reverse order and pushing them. I'd love to be able to
// rewrite this to use the values of pri and alt for comparison,
// instead of having to constantly load and reload two variables.
do
{
#emit LOAD.I
#emit PUSH.pri
arg_end -= BYTES_PER_CELL;
#emit LOAD.S.pri arg_end
}
while (arg_end > arg_start);

// Push the static format parameters.
#emit PUSH.S fstring
#emit PUSH.C 128
#emit PUSH.ADR message

// Now push the number of arguments passed to format, including both
// static and variable ones and call the function.
n += BYTES_PER_CELL * 3;
#emit PUSH.S n
#emit SYSREQ.C format

// Remove all data, including the return value, from the stack.
n += BYTES_PER_CELL;
#emit LCTRL 4
#emit LOAD.S.alt n
#emit ADD
#emit SCTRL 4

return SendClientMessage(playerid, color, message);
//return print(message);
}
else
{
return SendClientMessage(playerid, color, fstring);
//return print(fstring);
}
}


As you can see, there are essentially two overlapped functions in here. One prints the data, the other sends it to a player. These two are there to demonstrate using 1 or 3 static parameters. I've also reduced the number of variables (I almost removed arg_start by storing the value purely in "alt", but it didn't like it and I can't really be bothered to improve it more). I also slightly altered the loop to make it faster by putting the [FRM] addition outside.

For reference this is the code I was working on:


#emit CONST.alt fstring
#emit LCTRL 5
#emit ADD
#emit STOR.S.pri arg_end

#emit LOAD.S.alt n
#emit ADD
#emit LOAD.S.alt arg_end
#emit STOR.S.pri arg_end

CPF_loop_label:
{
#emit LOAD.I
#emit PUSH.pri
arg_end -= BYTES_PER_CELL;
#emit LOAD.S.pri arg_end
#emit JNEQ CPF_loop_label
}


VERY LATE EDIT:

I think the reason that second chunk of code never worked is that labels add a few more commands to set the stack to the correct state for that part of the script, even if the stack hasn't changed.

Sergei
08/02/2010, 12:13 AM
Is this CPF function ready for public use? I'm interested to use it, but I don't want to have any side effects from it if you know what I mean.

Daren_Jacobson
08/02/2010, 01:10 AM
So I edited it to make it just return a formatted string, rather than put it in another variable.


// HUGE credit to ZeeX for finally cracking this problem:
// http://forum.sa-mp.com/index.php?topic=79810.msg901721#msg901721
stock iFormat(len, fstring[], {Float, _}:...)
{
// Note: "len" is not used yet, the buffer size is fixed at 256 below.
// This is the number of parameters which are not variable that are passed
// to this function (i.e. the number of named parameters).
static const
STATIC_ARGS = 2;
// Get the number of variable arguments. "message" is declared here, outside
// the "if", so that it is still in scope at the final return.
new
message[256], // 256 to be safe.
n = (numargs() - STATIC_ARGS) * BYTES_PER_CELL;
if (n)
{
new
arg_start,
arg_end;

// Load the real address of the last static parameter. Do this by
// loading the address of the last known static parameter and then
// adding the value of [FRM].
#emit CONST.alt fstring
#emit LCTRL 5
#emit ADD
#emit STOR.S.pri arg_start

// Load the address of the last variable parameter. Do this by adding
// the number of variable parameters on the value just loaded.
#emit LOAD.S.alt n
#emit ADD
#emit STOR.S.pri arg_end

// Push the variable arguments. This is done by loading the value of
// each one in reverse order and pushing them. I'd love to be able to
// rewrite this to use the values of pri and alt for comparison,
// instead of having to constantly load and reload two variables.
do
{
#emit LOAD.I
#emit PUSH.pri
arg_end -= BYTES_PER_CELL;
#emit LOAD.S.pri arg_end
}
while (arg_end > arg_start);

// Push the static format parameters.
#emit PUSH.S fstring
#emit PUSH.C 256
#emit PUSH.ADR message

// Now push the number of arguments passed to format, including both
// static and variable ones and call the function.
n += BYTES_PER_CELL * 3;
#emit PUSH.S n
#emit SYSREQ.C format

// Remove all data, including the return value, from the stack.
n += BYTES_PER_CELL;
#emit LCTRL 4
#emit LOAD.S.alt n
#emit ADD
#emit SCTRL 4

}
return message;
}


hehe, I am so excited about what I can do with this.

[HUN]Gamestar
14/02/2010, 12:09 PM
BYTES_PER_CELL?

Toribio
14/02/2010, 03:33 PM
BYTES_PER_CELL?


Each cell contains 4 bytes, but it's a SA:MP choice...

[HUN]Gamestar
14/02/2010, 09:28 PM
I'm noob,but...error 017: undefined symbol "BYTES_PER_CELL"
#define BYTES_PER_CELL 4? xD = Memstack Error...

Daren_Jacobson
15/02/2010, 03:29 AM
Lol, I was thinking bits per cell, in my code i put 32 ><

Y_Less
15/02/2010, 11:01 AM
I'm noob,but...error 017: undefined symbol "BYTES_PER_CELL"
#define BYTES_PER_CELL 4? xD = Memstack Error...



The rest of your code would be helpful.

[HUN]Gamestar
15/02/2010, 02:28 PM
//debug

#include a_samp

#define BYTES_PER_CELL 4

stock CPF(playerid, color, fstring[], {Float, _}:...)
{
// This is the number of parameters which are not variable that are passed
// to this function (i.e. the number of named parameters).
static const
STATIC_ARGS = 3;
// Get the number of variable arguments.
new
n = (numargs() - STATIC_ARGS) * BYTES_PER_CELL;
if (n)
{
new
message[128],
arg_start,
arg_end;

// Load the real address of the last static parameter. Do this by
// loading the address of the last known static parameter and then
// adding the value of [FRM].
#emit CONST.alt fstring
#emit LCTRL 5
#emit ADD
#emit STOR.S.pri arg_start

// Load the address of the last variable parameter. Do this by adding
// the number of variable parameters on the value just loaded.
#emit LOAD.S.alt n
#emit ADD
#emit STOR.S.pri arg_end

// Push the variable arguments. This is done by loading the value of
// each one in reverse order and pushing them. I'd love to be able to
// rewrite this to use the values of pri and alt for comparison,
// instead of having to constantly load and reload two variables.
do
{
#emit LOAD.I
#emit PUSH.pri
arg_end -= BYTES_PER_CELL;
#emit LOAD.S.pri arg_end
}
while (arg_end > arg_start);

// Push the static format parameters.
#emit PUSH.S fstring
#emit PUSH.C 128
#emit PUSH.ADR message

// Now push the number of arguments passed to format, including both
// static and variable ones and call the function.
n += BYTES_PER_CELL * 3;
#emit PUSH.S n
#emit SYSREQ.C format

// Remove all data, including the return value, from the stack.
n += BYTES_PER_CELL;
#emit LCTRL 4
#emit LOAD.S.alt n
#emit ADD
#emit SCTRL 4

//return SendClientMessage(playerid, color, message);
return print(message);
}
else
{
//return SendClientMessage(playerid, color, fstring);
return print(fstring);
}
}

public OnFilterScriptInit()
{
CPF(0,0xCF0000FF,"test %s","test");
return 1;
}

smeti
15/02/2010, 06:30 PM
Very nice!

Is this OK?

#define BYTES_PER_CELL 4
// HUGE credit to ZeeX for finally cracking this problem:
// http://forum.sa-mp.com/index.php?topic=79810.msg901721#msg901721
stock CPF(playerid, color, fstring[], {Float, _}:...)
{
// This is the number of parameters which are not variable that are passed
// to this function (i.e. the number of named parameters).
static const
STATIC_ARGS = 3;
// Get the number of variable arguments.
new
n = (numargs() - STATIC_ARGS) * BYTES_PER_CELL;
if(n)
{
new
message[144],
arg_start,
arg_end;

// Load the real address of the last static parameter. Do this by
// loading the address of the last known static parameter and then
// adding the value of [FRM].
#emit CONST.alt fstring
#emit LCTRL 5
#emit ADD
#emit STOR.S.pri arg_start

// Load the address of the last variable parameter. Do this by adding
// the number of variable parameters on the value just loaded.
#emit LOAD.S.alt n
#emit ADD
#emit STOR.S.pri arg_end

// Push the variable arguments. This is done by loading the value of
// each one in reverse order and pushing them. I'd love to be able to
// rewrite this to use the values of pri and alt for comparison,
// instead of having to constantly load and reload two variables.
do
{
#emit LOAD.I
#emit PUSH.pri
arg_end -= BYTES_PER_CELL;
#emit LOAD.S.pri arg_end
}
while(arg_end > arg_start);

// Push the static format parameters.
#emit PUSH.S fstring
#emit PUSH.C 144
#emit PUSH.ADR message

// Now push the number of arguments passed to format, including both
// static and variable ones and call the function.
n += BYTES_PER_CELL * 3;
#emit PUSH.S n
#emit SYSREQ.C format

// Remove all data, including the return value, from the stack.
n += BYTES_PER_CELL;
#emit LCTRL 4
#emit LOAD.S.alt n
#emit ADD
#emit SCTRL 4

if(playerid == INVALID_PLAYER_ID)
{
#pragma unused playerid
return SendClientMessageToAll(color, message);
} else {
return SendClientMessage(playerid, color, message);
}
//return print(message);
} else {
if(playerid == INVALID_PLAYER_ID)
{
#pragma unused playerid
return SendClientMessageToAll(color, fstring);
} else {
return SendClientMessage(playerid, color, fstring);
}
//return print(fstring);
}
}


public OnPlayerConnect(playerid)
{
// INVALID_PLAYER_ID Message to all
CPF(INVALID_PLAYER_ID, 0x00FF00FF, "%s has joined the server.", pName(playerid));
return 1;
}

Y_Less
15/02/2010, 07:50 PM
If it works, then yes. Although you don't need the pragmas - playerid is used in the function, even if it's not on that line.

[HUN]Gamestar
15/02/2010, 09:20 PM
Not working. My compiler crashes. Application error - it refers to a bad memory address.
The code is the same as before.

Y_Less
15/02/2010, 09:46 PM
Is it a compiler error or a server error? If it's a server error make sure you have format used elsewhere in your script.

[HUN]Gamestar
15/02/2010, 09:52 PM
Is it a compiler error or a server error? If it's a server error make sure you have format used elsewhere in your script.


"compiler error".Windows Error message...Bad memstack

RyDeR`
15/02/2010, 10:45 PM
Uh excuse me but can I ask what #emit is?

Y_Less
15/02/2010, 11:08 PM
It allows you to output PAWN VM OpCodes directly, instead of having them generated for you by the compiler from typed code. See:

http://www.compuphase.com/pawn/Pawn_Implementer_Guide.pdf

For more information (note: NOT pawn-lang.pdf, this is a more advanced guide).
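
To give a feel for it, here is a tiny sketch (AddCells is just a made-up name for the example) that adds two numbers by writing the opcodes by hand instead of typing "a + b". The instruction names are the ones used in the code earlier in this thread:

```pawn
#include <a_samp>

stock AddCells(a, b)
{
    new result;
    #emit LOAD.S.pri a      // pri = value of parameter "a"
    #emit LOAD.S.alt b      // alt = value of parameter "b"
    #emit ADD               // pri = pri + alt
    #emit STOR.S.pri result // result = pri
    return result;
}

main()
{
    printf("%d", AddCells(2, 3)); // should print 5
}
```

There is rarely a reason to do simple things like this by hand; #emit is mostly useful for things the language can't express directly, like the argument passthrough above.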

Dabombber
17/02/2010, 12:25 AM
Would it be possible to use this to fix the bug in CallLocalFunction with empty strings? This would involve looking through the format for string arguments, checking if it was an empty string with getarg, changing it from 's' to 'i' and passing the location of a blank string to the real CallLocalFunction.

Something along the lines of

stock bf_CallLocalFunction(const function[], format[], {Float,_}:...)
{
for(new i; format[i]; i++) {
if(format[i] == 's' && getarg(2 + i) == '\0') {
format[i] = 'i';
}
}
// insert actual code here
return CallLocalFunction(function, format, ...);
}
#if defined _ALS_CallLocalFunction
#undef CallLocalFunction
#else
#define _ALS_CallLocalFunction
#endif
#define CallLocalFunction bf_CallLocalFunction


I know that it's possible to pass empty strings like this, but I'm not sure exactly how it works.

#include <a_samp>

new gString[] = "";

public OnFilterScriptInit()
{
CallLocalFunction("test", "ii", 5, 0); // 0 for gString[0], 4 for gString[1] etc

gString[0] = gString[0] + 0;
return 1;
}

forward test(integer, string[]);
public test(integer, string[])
{
printf("integer: %i\nstring: \"%s\"", integer, string);
}

Y_Less
17/02/2010, 10:48 AM
Firstly, it's not a bug - the PAWN VM can't pass empty strings to scripts, so instead it passes strings that are nearly empty.

Secondly, I don't know - try it and see!

yezizhu
19/02/2010, 01:39 PM
sorry wrong post.

ziomal432
02/05/2010, 03:31 PM
Little typo.

new
gLastTime[MAX_PLAYERS];

#define EXPIRY 1000;

Simon
04/05/2010, 09:53 AM
Little typo.

new
gLastTime[MAX_PLAYERS];

#define EXPIRY 1000;


It's not supposed to have a ';'... a define is simply a search/replace before the code is compiled. Whenever the word EXPIRY is found it is replaced with 1000.


#define EXPIRY 1000

if (variable > EXPIRY) // before replace

if (variable > 1000) // after replace with define above. This is desired.
if (variable > 1000;) // after replace with '#define EXPIRY 1000;'.. not desired. Will cause a syntax error.

Y_Less
04/05/2010, 11:12 AM
No, he was pointing out the fact that there was a semicolon there, which I removed between his and your posts. Sorry Simon and thanks ziomal432.

Stas92
04/05/2010, 11:40 AM
Don't really understand that part: http://forum.sa-mp.com/index.php?topic=79810.0#post_vehicles
My Code looks like that:

stock IsABike(fahrzeug)
{
new Motorads[] = { 581, 521, 463, 522, 461, 471, 468, 586 };
for(new i = 0; i < sizeof(Motorads); i++) {
if(GetVehicleModel(fahrzeug) == Motorads[i]) return 1;
}
return 0;
}

stock ValidVehicle(fahrzeug) {
new Convertibles[4] = {480, 533, 439, 555};
new Industrial[26] = {499, 422, 482, 498, 609, 524, 578, 455, 403, 414, 582, 443, 514, 413, 515, 440, 543, 605, 459, 531, 408, 552, 478, 456, 554};
new LowRider[8] = {536, 575, 534, 567, 535, 566, 576, 412};
new OffRoad[13] = {568, 424, 573, 579, 400, 500, 444, 556, 557, 470, 489, 505, 595};
new Service[19] = {416, 433, 431, 438, 437, 523, 427, 490, 528, 407, 544, 596, 596, 597, 598, 599, 432, 601, 420};
new Saloon[35] = {445, 504, 401, 518, 527, 542, 507, 562, 585, 419, 526, 604, 466, 492, 474, 546, 517, 410, 551, 516, 467, 600, 426, 436, 547, 405, 580, 560, 550, 549, 540, 491, 529, 421};
new Sports[20] = {602, 429, 496, 402, 541, 415, 589, 587, 565, 494, 502, 503, 411, 559, 603, 475, 506, 451, 558, 477};
new Wagons[5] = {418, 404, 479, 458, 561};
new modelid = GetVehicleModel(fahrzeug);
new i;
// Each loop bound matches the number of model IDs listed in its array.
for(i=0;i<4;i++){
if(Convertibles[i]==modelid) return 1;
}
for(i=0;i<25;i++){
if(Industrial[i]==modelid) return 1;
}
for(i=0;i<8;i++){
if(LowRider[i]==modelid) return 1;
}
for(i=0;i<13;i++){
if(OffRoad[i]==modelid) return 1;
}
for(i=0;i<19;i++){
if(Service[i]==modelid) return 1;
}
for(i=0;i<34;i++){
if(Saloon[i]==modelid) return 1;
}
for(i=0;i<20;i++){
if(Sports[i]==modelid) return 1;
}
for(i=0;i<5;i++){
if(Wagons[i]==modelid) return 1;
}
return 0;
}

Zeex
04/05/2010, 12:15 PM
What exactly did you not understand there? First you need to make an enum of model flags


enum (<<= 1)
{
MODEL_VALID, // is a model valid (for the ValidVehicle)
MODEL_BIKE,
MODEL_WAGONS,
MODEL_SPORTS,
MODEL_SALOON,
// etc
}


and then you fill the global array of models like in that tutorial


new gModels[] =
{
x,
x,
MODEL_VALID | MODEL_SPORTS,
x,
x,
x,
x,
x,
x,
x
};


so your ValidVehicle() will look like the following:


stock bool:ValidVehicle(fahrzeug)
{
new model = GetVehicleModel(fahrzeug);

if (model != 0)
{
return ((gModels[model - 400] & MODEL_VALID) != 0);
}
return false;
}

Simon
07/05/2010, 05:07 AM
enum (<<= 1)
{
MODEL_VALID, // is a model valid (for the ValidVehicle)
MODEL_BIKE,
MODEL_WAGONS,
MODEL_SPORTS,
MODEL_SALOON,
// etc
}



Maybe just a typo on your part but it's important for this code to work to have "= 1" on the first item (or else you have no bit to shift, enum items start at 0). The correction would be as follows:


enum (<<= 1)
{
MODEL_VALID = 1, // is a model valid (for the ValidVehicle)
MODEL_BIKE,
MODEL_WAGONS,
MODEL_SPORTS,
MODEL_SALOON,
// etc
}
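
A small sketch to show what the corrected enum actually generates (assuming a normal SA:MP gamemode with a main()). Each item is the previous one shifted left by one bit, which is why the first one has to start at 1 - starting from 0 every value would stay 0:

```pawn
#include <a_samp>

enum (<<= 1)
{
    MODEL_VALID = 1, // 1  (binary 00001)
    MODEL_BIKE,      // 2  (binary 00010)
    MODEL_WAGONS,    // 4  (binary 00100)
    MODEL_SPORTS,    // 8  (binary 01000)
    MODEL_SALOON     // 16 (binary 10000)
}

main()
{
    printf("%d %d %d %d %d", MODEL_VALID, MODEL_BIKE, MODEL_WAGONS, MODEL_SPORTS, MODEL_SALOON);

    // Several flags can be combined in one cell and tested individually.
    new flags = MODEL_VALID | MODEL_SPORTS;
    printf("sports: %d, bike: %d", _:((flags & MODEL_SPORTS) != 0), _:((flags & MODEL_BIKE) != 0));
}
```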

Stas92
07/05/2010, 06:26 AM
Damn, still don't understand. Where do I need to place the modelids? And, how to convert my function to that enum?

Nero_3D
07/05/2010, 04:17 PM
This should do it for you. You only need to add new model labels and use the macro to mark the model IDs with them


enum (<<= 1)
{
MODEL_VALID = 1,
MODEL_BIKE
// etc
}
new gModels[212];
//AddModelLabel(modelid, MODEL_TYPE | ...)
#define AddModelLabel(%1,%2) (gModels[(%1) - 400] |= MODEL_VALID | (%2))
//ValidVehicle(modelid)
#define ValidVehicle(%1) (gModels[(%1) - 400] & MODEL_VALID)
//IsVehicle(modelid, MODEL_TYPE)
#define IsVehicle(%1,%2) (gModels[(%1) - 400] & (%2))

//OnGameModeInit
new Motorads[] = { 581, 521, 463, 522, 461, 471, 468, 586 };
for(new i; i < sizeof(Motorads); i++) {
AddModelLabel(Motorads[i], MODEL_BIKE);
}
//etc

//Example
if(ValidVehicle(modelid))
{
if(IsVehicle(modelid, MODEL_BIKE))
{
}
}


The modelid counts as valid if you add any label.
These "functions" are only macros, so they don't have a modelid check.
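
If the missing range check is a worry (GetVehicleModel returns 0 for a vehicle that doesn't exist, which would index outside the array), the same idea can be written as real functions. This is only a sketch: Model_AddLabel, Model_Has and MIN_MODEL are names invented for the example, and it assumes vehicle models run from 400 to 611 as the 212-slot array above implies:

```pawn
#include <a_samp>

enum (<<= 1)
{
    MODEL_VALID = 1,
    MODEL_BIKE
}

// Vehicle models are assumed to run from 400 to 611, hence 212 slots.
#define MIN_MODEL (400)

new gModels[212];

stock Model_AddLabel(modelid, flags)
{
    // Only touch the array when the model ID maps to a valid slot.
    if (MIN_MODEL <= modelid < MIN_MODEL + sizeof (gModels))
    {
        gModels[modelid - MIN_MODEL] |= MODEL_VALID | flags;
    }
}

stock bool:Model_Has(modelid, flags)
{
    if (MIN_MODEL <= modelid < MIN_MODEL + sizeof (gModels))
    {
        return (gModels[modelid - MIN_MODEL] & flags) != 0;
    }
    // Out of range (including 0 for "no such vehicle") has no labels.
    return false;
}

main()
{
    Model_AddLabel(522, MODEL_BIKE);
    printf("522 is a bike: %d", _:Model_Has(522, MODEL_BIKE));
    printf("411 is valid: %d", _:Model_Has(411, MODEL_VALID));
    printf("0 is valid: %d", _:Model_Has(0, MODEL_VALID));
}
```

The functions cost a call each time where the macros cost nothing, so which one to use depends on whether the model IDs passed in can be trusted.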

¤Adas¤
07/05/2010, 04:37 PM
This will not work - CallLocalFunction(function, format, ...);
It will show several errors... :(

ziomal432
18/06/2010, 07:47 PM
So for example:


for (new i = 0; i < MAX_PLAYERS; i++)


Is faster than:


for (new i = 0, j = GetMaxPlayers(); i < j; i++)


As the main part of the loop in the first uses a constant, whereas the main part in the second uses a variable (the overhead of a single function call in a loop is negligible compared to the repeated check). This second version is itself faster than:


for (new i = 0; i < GetMaxPlayers(); i++)


As this third version uses a repeated function call rather than a variable or constant.


#define SpeedTest_Init new time_537336;
#define SpeedTest_LoopStart time_537336 = GetTickCount(); for(new i_537336; i_537336 < 100000; ++i_537336) {
#define SpeedTest_LoopEnd } printf("Time: %d ms", GetTickCount() - time_537336);

Speed results:
SpeedTest_Init
SpeedTest_LoopStart
for(new i; i < MAX_PLAYERS; i++)
{
}
SpeedTest_LoopEnd
SpeedTest_LoopStart
for(new i, c = GetMaxPlayers(); i < c; i++)
{
}
SpeedTest_LoopEnd

500 slots:
[21:42:18] Time: 2728 ms
[21:42:20] Time: 2789 ms

250 slots:
[21:43:25] Time: 2722 ms
[21:43:27] Time: 1418 ms

100 slots:
[21:44:12] Time: 2700 ms
[21:44:13] Time: 596 ms

-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-

SpeedTest_Init
SpeedTest_LoopStart
for(new i; i < MAX_PLAYERS; i++)
{
if(IsPlayerConnected(i))
{
}
}
SpeedTest_LoopEnd
SpeedTest_LoopStart
for(new i, c = GetMaxPlayers(); i < c; i++)
{
if(IsPlayerConnected(i))
{
}
}
SpeedTest_LoopEnd

500 slots:
[21:46:18] Time: 5792 ms
[21:46:24] Time: 5863 ms

250 slots:
[21:45:49] Time: 5783 ms
[21:45:52] Time: 2958 ms

100 slots:
[21:44:57] Time: 5790 ms
[21:44:58] Time: 1213 ms

-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-

Most servers have fewer than 300 slots, so option #2 is better.


EDIT:
Nevermind.

Finn
18/06/2010, 08:32 PM
Most servers have fewer than 300 slots, so option #2 is better.

Obviously MAX_PLAYERS has to be redefined to the correct number. What kind of moron has 500-slot arrays (500 is the default of MAX_PLAYERS) in a 20-slot server?

When MAX_PLAYERS is set correctly, it's clearly faster than the second, GetMaxPlayers, method.
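For anyone unsure what "set correctly" looks like, here is a minimal sketch. The value 50 and the names gScore and ResetScores are only examples made up for this illustration; the number should match the maxplayers setting of your own server.

```pawn
#include <a_samp>

// a_samp gives MAX_PLAYERS a large default. Replace it with the real slot
// count of the server (50 is just an example value for this sketch).
#undef MAX_PLAYERS
#define MAX_PLAYERS (50)

// Arrays sized with MAX_PLAYERS now only use 50 cells.
new gScore[MAX_PLAYERS];

// Hypothetical helper: the loop bound is a compile-time constant.
stock ResetScores()
{
    for (new i = 0; i < MAX_PLAYERS; i++)
    {
        gScore[i] = 0;
    }
}

main()
{
    ResetScores();
    printf("MAX_PLAYERS is now %d", MAX_PLAYERS);
}
```

The redefinition has to come after the include and before anything that uses MAX_PLAYERS, otherwise the old value is still in effect for those declarations.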

Montis123
18/06/2010, 10:18 PM
Y_Less, you're the best scripter ;D

Y_Less
21/06/2010, 01:38 PM
Did you compare any of those times to foreach?

Y_Less
29/08/2010, 03:48 PM
I ran some tests on different command processors, including timings and common bugs. The sources are here:

http://y-less.pastebin.com/fk0gx0Qe
http://y-less.pastebin.com/H6pSCuTV
http://y-less.pastebin.com/tQQTaFVa

The commands all have checks in to see if a player can use the command, as y_commands has that built in already. The strcmp and strtok versions have the known bug of typing /zzzz getting /zzz. mcmd gives a weird bug when you turn on the printf in mcmd_zzz: it seems to call the wrong function and I'm not sure why. It also doesn't like you doing "/zzz " when using the no parameter version. Note that this uses a custom version of y_command because all the commands are so close in name and in alphabetical order that the hashes are in numerical order, which gives worst case performance in quicksort, leading to a crash I need to fix properly at some point. This should not be a problem for normal usage of commands however. Finally, I had to remove the IsPlayerConnected return or it would not have called anything.

For 10,000 loops with 53 commands the results are:

Player allowed to use the command:


Results:
ycmd: 293
zcmd: 144
dcmd: 1154
mcmd: 1187
strcmp: 1132
strtok: 1617


Player not allowed to use the command:


Results:
ycmd: 227
zcmd: 143
dcmd: 1829
mcmd: 1188
strcmp: 1129
strtok: 1611


With 677 commands the results are:

Allowed:


Results:
ycmd: 375
zcmd: 157
dcmd: 11444
mcmd: 15334
strcmp: 11154
strtok: 14856


Disallowed:


Results:
ycmd: 275
zcmd: 157
dcmd: 22364
mcmd: 15338
strcmp: 11178
strtok: 14862


Clearly ycmd and zcmd, using the CallRemoteFunction method, have very little overhead regardless of the number of commands in the mode, unlike strcmp based methods which get much slower on average for growing numbers of commands. The results for dcmd were surprising, it shouldn't be that much slower than strcmp. ycmd is slightly slower than zcmd as it has the overhead of command renaming and generic permissions, but is still vastly faster than others.
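To show why the number of commands barely matters for this family of processors, here is a rough sketch of the idea. It is not the actual zcmd or ycmd source: GetCommandFunction and cmd_hello are hypothetical names, and it uses CallLocalFunction on the assumption that every command lives in the same script.

```pawn
#include <a_samp>

main() {}

// Hypothetical helper: writes "cmd_<name>" (lower case) into funcname and
// returns the index in cmdtext where the parameters start.
stock GetCommandFunction(const cmdtext[], funcname[], size = sizeof funcname)
{
    new pos = 1, len; // pos 1 skips the leading '/'
    funcname[0] = '\0';
    strcat(funcname, "cmd_", size);
    len = strlen(funcname);
    while (cmdtext[pos] > ' ' && len < size - 1)
    {
        funcname[len++] = tolower(cmdtext[pos++]);
    }
    funcname[len] = '\0';
    // Skip the spaces between the command name and its parameters.
    while (cmdtext[pos] == ' ') pos++;
    return pos;
}

// One public function per command; the name is all the dispatcher needs.
forward cmd_hello(playerid, params[]);
public cmd_hello(playerid, params[])
{
    #pragma unused params
    SendClientMessage(playerid, 0xFFFFFFFF, "Hello!");
    return 1;
}

public OnPlayerCommandText(playerid, cmdtext[])
{
    new funcname[32];
    new pos = GetCommandFunction(cmdtext, funcname);
    // Empty strings passed to CallLocalFunction cause problems, so a
    // placeholder is passed when there are no parameters.
    if (cmdtext[pos] == '\0')
    {
        return CallLocalFunction(funcname, "is", playerid, "\1");
    }
    // Returns 0 (unknown command) if no such public function exists.
    return CallLocalFunction(funcname, "is", playerid, cmdtext[pos]);
}
```

The server finds the public function by name, so the script itself never walks a chain of strcmp calls, however many commands are defined.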

DiddyBop
29/08/2010, 11:33 PM
Holy shit, is zcmd really that much faster than dcmd?

Y_Less
29/08/2010, 11:52 PM
Yes, just like everyone has been saying for a long time! I wasn't even really looking to do extensive timings on those two as the facts are already well known, I was looking at ycmd and mcmd as the new ones.

DiddyBop
30/08/2010, 12:06 AM
So a ZCMD or YCMD /register would be faster than a DCMD /register?

Y_Less
30/08/2010, 01:12 AM
Yes - any command in ycmd or zcmd would be faster than the same command in dcmd, and easier to write.

MrDeath537
30/08/2010, 05:49 AM
So, is mcmd more efficient than dcmd?...

Y_Less
30/08/2010, 09:13 AM
Only in some circumstances, and frankly I don't trust those results as mcmd was behaving VERY weirdly!

RSX
30/08/2010, 12:45 PM
Hello, Y_Less. I re-read this article, and for everyone's benefit I must say that I was recently looking through the PAWN language PDF (I think I was looking for the size of "new") and found something that can be very useful at times, especially for speed: a LOCAL variable can be CONSTANT, such as
new const MAX_PLYR=GetMaxPlayers();
Another thing: when working with variables, note that a cell declared with new can be 64-bit. So a 64-bit system may run SA-MP even worse in bad cases, and scripts that use bitwise operations must(?) be marked as 32-bit or 64-bit ones.
Thanks for your work on SA-MP. I hope I'm not just saying something you already know, and that this improves some of your scripts.

One last question: does PAWN run the block if the "if" check evaluates to 1 or more? While trying to learn C# (yes, I now know it's slower and I don't need .NET, so it's no use to me) I found that C and C++ run the block if the value is 1 or more (C# doesn't do that).
This question comes from:
new vehicleid = GetPlayerVehicleID(playerid);
if (vehicleid)
SetVehiclePos(vehicleid, 0.0, 0.0, 10.0);

Scarface~
30/08/2010, 01:03 PM
Wow, this must have taken forever, but they're very nice. :)

Y_Less
30/08/2010, 09:26 PM
Hello, Y_Less. I re-read this article, and for everyone's benefit I must say that I was recently looking through the PAWN language PDF (I think I was looking for the size of "new") and found something that can be very useful at times, especially for speed: a LOCAL variable can be CONSTANT, such as
new const MAX_PLYR=GetMaxPlayers();

GetMaxPlayers is a function call, so is not constant, so you can't do that. Local (and global) constants can only be done using true constants:

new const A = 42;

I should really use them more.

Another thing: when working with variables, note that a cell declared with new can be 64-bit. So a 64-bit system may run SA-MP even worse in bad cases, and scripts that use bitwise operations must(?) be marked as 32-bit or 64-bit ones.

No, PAWN can be run in 64bit, but needs to be configured to do so. SA:MP uses the 32bit version of PAWN only, so all scripts are the same size. The AMX Mod-X compiler actually does generate two versions of the script in the same AMX file, a 32bit and a 64bit version.

Thanks for your work on SA-MP. I hope I'm not just saying something you already know, and that this improves some of your scripts.

Thanks.

One last question: does PAWN run the block if the "if" check evaluates to 1 or more? While trying to learn C# (yes, I now know it's slower and I don't need .NET, so it's no use to me) I found that C and C++ run the block if the value is 1 or more (C# doesn't do that).
This question comes from:
new vehicleid = GetPlayerVehicleID(playerid);
if (vehicleid)
SetVehiclePos(vehicleid, 0.0, 0.0, 10.0);

Yes, I've often posted scripts like that, in fact I have mentioned using the return from GetPlayerVehicleID as an IsPlayerInAnyVehicle check a few times in the past.

MrDeath537
31/08/2010, 03:23 PM
Only in some circumstances, and frankly I don't trust those results as mcmd was behaving VERY weirdly!

Ok, I'll remove mcmd_init to make it faster.

RSX
04/09/2010, 11:00 AM
#define CODE_1 printf("%d", 42);
#define CODE_2 new str[4]; format(str, sizeof (str), "%d", 42); print(str);
#define ITERATIONS (100000)

Test()
{
new
t0,
t1,
t2,
i;
t0 = GetTickCount();
for (i = 0; i < ITERATIONS; i++)
{
CODE_1
}
t1 = GetTickCount();
for (i = 0; i < ITERATIONS; i++)
{
CODE_2
}
t2 = GetTickCount();
printf("Time 1: %04d, time 2: %04d", t1 - t0, t2 - t1);
}

My testing report: at 1,000 iterations printf is about 6 times slower; at 10,000, printf is about 200-400 ms slower; at 100,000, printf is faster :D. Tested on a 3.06 GHz processor with 2 GB of RAM. PS: the 100,000 test wastes at least a minute, don't do that.
Oh, and now my server log is 4 megabytes.

100k Test:
[14:00:45] Time 1: 10362, time 2: 10493

10k Test:
[14:13:33] Time 1: 1307, time 2: 0774
[14:13:33] Time 1: 0991, time 2: 0811
[14:13:33] Time 1: 1426, time 2: 1544

1k Test:
[14:13:01] Time 1: 0642, time 2: 0357
[14:13:39] Time 1: 0238, time 2: 0077
[14:13:33] Time 1: 0243, time 2: 0072 - both change by 5ms.

0.1k Test:
[14:10:13] Time 1: 0164, time 2: 0165
[14:10:13] Time 1: 0064, time 2: 0066
[14:10:13] Time 1: 0068, time 2: 0100

The 2 main points I took from this: 1. Making a script which tests basic features like this would be useful. 2. As you can see, it often makes a very big difference. Mostly, formatting a new string is faster, but it makes the script more complex; unless the same string is used twice (as a client message and a print), I would recommend just using the formatted string, of course.

Edit: They're almost equally fast, but as I use C I prefer printf. (Also because before Pawno I didn't know what print was, and now that I know I don't really enjoy the concept.)
To Y_Less: "What OS was that on?" Windows XP Service Pack 2.

Y_Less
04/09/2010, 11:18 AM
What OS was that on? Unfortunately print statements are hard to time as they make blocking calls to the operating system. I would frankly advise using printf always as you can remove it easily for production servers.

Y_Less
09/09/2010, 10:48 AM
Added a new small section on copying strings without using format (very very bad).

Simon
09/09/2010, 11:48 AM
Good to see it, I see this practice used everywhere here. Seeing format used for copying strings just seems so off for some reason and slower. I'm surprised there's no native strcpy function!
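For reference, one common way to copy a string without format is to empty the destination and let strcat do the work. This is only a sketch under that assumption; CopyString is a made-up name, not a native and not necessarily the exact code from the first post.

```pawn
#include <a_samp>

// Hypothetical helper: empties dest, then appends source. strcat takes
// care of not writing past "size" cells.
stock CopyString(dest[], const source[], size = sizeof dest)
{
    dest[0] = '\0';
    return strcat(dest, source, size);
}

main()
{
    new name[32];
    CopyString(name, "Y_Less");
    print(name);
}
```

Unlike format(dest, sizeof dest, "%s", source), nothing has to parse a format string here, and a '%' in the source text can never be misread as a specifier.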

[L3th4l]
12/09/2010, 03:58 AM
Y_Less, nice one on the strcpy!!!

Slice
26/09/2010, 09:35 PM
I'm working on an include to support byte and short (2 bytes) arrays. I need some help optimizing it!

Right now it takes just over twice as long to set and get in byte arrays compared to normal arrays. What I'm not satisfied with is mainly that I use mod and multiply it by 8 three times in SetByte; I couldn't figure out a better way.

Here's what I got so far:

#if !defined ceildiv
#define ceildiv(%1,%2) (((%1) + (%2) - 1) / (%2)) // Y_Less
#endif

#define byte:%1<%2> %1[ ceildiv( %2, ( cellbits / 8 ) ) ]

#if !defined cellbytes
#define cellbytes \
( cellbits / 8 )
#endif

#define cellshift_byte \
( cellbits / 16 )

#define GetByte(%0,%1) \
( ( ( %0[ ( %1 ) >>> cellshift_byte ] ) >> ( ( ( %1 ) % cellbytes ) * 8 ) ) & 0xFF )

#define SetByte(%0,%1,%2); \
{ %0[ ( %1 ) >>> cellshift_byte ] = ( ( ( %0[ ( %1 ) >>> cellshift_byte ] ) | ( 0xFF << ( ( ( %1 ) % cellbytes ) * 8 ) ) ) ^ ( 0xFF << ( ( ( %1 ) % cellbytes ) * 8 ) ) ) | ( ( %2 ) << ( ( ( %1 ) % cellbytes ) * 8 ) ); }

Y_Less
26/09/2010, 09:52 PM
*8 is the same as <<3, /8 is the same as >>3.

However, the PAWN packed string code actually provides a method for accessing individual bytes efficiently:


// Declare an 8 byte (2 cell) array.
new array[8 char];

// Set the 5th byte.
array{4} = 7;

// Get the 3rd byte:
new var = array{2};


I forgot all about this until I was reminded by looking through the mxini source code.
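If you do want to pack bytes by hand, the shift and mask versions of the arithmetic above look roughly like this. It is a sketch that assumes 32-bit cells (4 bytes per cell, as in SA:MP); GetByteAt and SetByteAt are hypothetical names. The byte order inside each cell follows Slice's macros and may well differ from the layout array{n} uses, so don't mix the two on the same array.

```pawn
#include <a_samp>

// index >>> 2       is index / 4 (which cell holds the byte)
// (index & 3) << 3  is (index % 4) * 8 (bit offset inside that cell)

// Hypothetical helper: read one byte from a cell array.
stock GetByteAt(const arr[], index)
{
    return (arr[index >>> 2] >>> ((index & 3) << 3)) & 0xFF;
}

// Hypothetical helper: clear the old byte, then OR in the new one.
stock SetByteAt(arr[], index, value)
{
    new shift = (index & 3) << 3;
    arr[index >>> 2] = (arr[index >>> 2] & ~(0xFF << shift)) | ((value & 0xFF) << shift);
}

main()
{
    new data[2]; // room for 8 bytes
    SetByteAt(data, 5, 0xAB);
    printf("%d", GetByteAt(data, 5)); // prints 171
}
```

Computing the shift once and clearing with a single AND replaces the three repeated "mod then multiply by 8" expressions in SetByte.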

Slice
26/09/2010, 10:14 PM
Oh.. curly brackets! I forgot that earlier when testing the uchar arrays. Thanks.

hey.. maybe utilizing these arrays would increase the performance in ysi_bit.

Y_Less
26/09/2010, 10:23 PM
I doubt it - the code would end up being exactly the same, just with smaller numbers (which in modern processors makes no difference at all).

VIRUXE
30/09/2010, 05:03 PM
Is it best to declare a bunch of variables like:

new Smth[MAX_PLAYERS], Smth1[MAX_PLAYERS]; And so on...

Or just create an enum with all those variables and use it like SmthData[MAX_PLAYERS][E_SMTH] ?

Y_Less
30/09/2010, 05:19 PM
It depends on what you're doing. From a speed point of view the former is (almost insignificantly) faster, but the latter is much nicer. Of course if you can, just make one.
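For anyone who hasn't seen the two layouts side by side, here is a small sketch. All the names (gKills, gDeaths, E_STATS, gStats, AddKill) are invented for the example.

```pawn
#include <a_samp>

// Option 1: one array per value.
new gKills[MAX_PLAYERS];
new gDeaths[MAX_PLAYERS];

// Option 2: one enum-indexed array holding every value for a player.
enum E_STATS
{
    E_STATS_KILLS,
    E_STATS_DEATHS
}
new gStats[MAX_PLAYERS][E_STATS];

// Hypothetical helper updating both layouts, to compare the syntax.
stock AddKill(playerid)
{
    gKills[playerid]++;                 // option 1
    gStats[playerid][E_STATS_KILLS]++;  // option 2
}

main()
{
    AddKill(0);
    gDeaths[0] = 0;
    gStats[0][E_STATS_DEATHS] = 0;
    printf("%d %d", gKills[0], gStats[0][E_STATS_KILLS]);
}
```

The second form needs one extra index per access, which is where the tiny speed difference comes from, but it keeps everything about a player in one place.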

RSX
30/09/2010, 05:19 PM
*8 is the same as <<3, /8 is the same as >>3
I don't doubt that floatmul isn't as fast as bitwise operations. As far as I know, bitwise operations have to be faster than - and +, whether on Float or Int.

Btw, if i'm not wrong, someone's trying to make another bool variable type. I hope i'm wrong.

Y_Less
30/09/2010, 05:25 PM
Shifting floats makes no sense. Those are only the same for ints.

RSX
30/09/2010, 05:42 PM
Shifting floats makes no sense. Those are only the same for ints.

I'm talking about the script that he's making, where he needs int calculations. When I say it's faster than an Int or Float sum, I only mean that it's faster, not that it can do anything other than shift bits in place of multiplying ints.

Slice
19/10/2010, 01:34 PM
Any suggestions for this code? Right now it takes, on average, twice as long as using format.
I would use an array and pass it through, but I tried keeping things simple for people using it.

http://forum.sa-mp.com/showthread.php?t=184328

Y_Less
19/10/2010, 02:27 PM
Having looked at the code, no, nothing really. Anything I could say is practically insignificant. I will say the code is nothing like how I would have done it, but I don't know if my method would be any better, and I doubt it given how good native functions really are.

Slice
20/10/2010, 05:09 PM
Yeah the natives are fast, indeed. I just tried rewriting it to loop through the string from the right, shifting the numbers and inserting thousand separators while doing so. It was almost half as fast as using strins.

Anyways,
After doing some tests I noticed it doesn't really need optimizations. The results:
Bench for FormatNumber( fMyFloat ): executed 214 times in 1 ms.
Bench for FormatNumber( iMyInteger ): executed 199 times in 1 ms.
Bench for FormatNumber( hex:iMyInteger ): executed 395 times in 1 ms.
Bench for FormatNumber( bit:iMyInteger ): executed 329 times in 1 ms.
Bench for FormatNumber( bit_byte:cMyByteArrayy{ 6 } ): executed 382 times in 1 ms.
Bench for FormatNumber( bMyBoolean ): executed 472 times in 1 ms.

This is how I tested it (in case anyone is interested):
#include <a_samp>
#include "formatnumber.inc"

#pragma tabsize 0

// These macros will show how many times a piece of code could run in x milliseconds.

// Usage: START_BENCH( runtime ); followed by FINISH_BENCH( name );
// The runtime is in ms, it declares how long the code should be run.
// The name is just for reference.

#define START_BENCH(%0); \
{ \
new iMilliSeconds = %0, iCount = 0, iTick = GetTickCount( ); \
while ( iTick == GetTickCount( ) ) { } /* Make sure it starts measuring in the beginning of a new ms.*/ \
iTick = GetTickCount( ); \
while ( GetTickCount( ) - iTick < iMilliSeconds ) /* loop until the time has passed */ \
{ \
++iCount;

#define FINISH_BENCH(%0); \
} \
printf( "Bench for " %0 ": executed %d times in %d ms.", iCount, iMilliSeconds ); \
}

public OnFilterScriptInit( )
{
new
Float:fMyFloat = 12678513.1367634,
iMyInteger = 5342714879,
bool:bMyBoolean = false,
cMyByteArray[ 10 char ]
;

START_BENCH( 1 );

FormatNumber( fMyFloat );

FINISH_BENCH( "FormatNumber( fMyFloat )" );

// ------------------------------------------

START_BENCH( 1 );

FormatNumber( iMyInteger );

FINISH_BENCH( "FormatNumber( iMyInteger )" );

// ------------------------------------------

START_BENCH( 1 );

FormatNumber( hex:iMyInteger );

FINISH_BENCH( "FormatNumber( hex:iMyInteger )" );

// ------------------------------------------

START_BENCH( 1 );

FormatNumber( bit:iMyInteger );

FINISH_BENCH( "FormatNumber( bit:iMyInteger )" );

// ------------------------------------------

START_BENCH( 1 );

FormatNumber( bit_byte:cMyByteArray{ 6 } );

FINISH_BENCH( "FormatNumber( bit_byte:cMyByteArrayy{ 6 } )" );

// ------------------------------------------

START_BENCH( 1 );

FormatNumber( bMyBoolean );

FINISH_BENCH( "FormatNumber( bMyBoolean )" );
}

dUDALUS
18/11/2010, 04:03 PM
Hello

After a 6-hour exam I came home, read your post, went downstairs and printed the 21 pages ;)
Great post, and the longest that I have ever seen; you've made me sleepless for a long time. Great job and great support for the GTA SA-MP community.

Joe_
26/11/2010, 01:00 PM
I have tried reading and testing the info in your (Y_Less's) and Kyosaur's binary/bit posts myself, and I can never get it to work. I am beginning to grasp binary (I think...) but there are a few things I don't know:

I tested this as in your example:



#define IsBike(%0) \
(gModels[(%0) - 400] & VEHICLE_BIKE)

#define IsBoat(%0) \
(gModels[(%0) - 400] & VEHICLE_BOAT)

// etc..

new gModels[] = { VEHICLE_BIKE, VEHICLE_CAR, VEHICLE_BOAT | VEHICLE_POLICE };

if(IsBike(pModelid))
{
// Do something
}
if(IsCar(pModelid))
{
// Do something
}


But whatever vehicle I get in, it says it's a bike.

pModelid is the modelid (GetVehicleModel) and is returning the right modelid (I'm using an admiral, Model 445)

Which should be index 45 (445-400)

I have got 211 entries (as per the amount of vehicles in GTA) all have their own attributes, from 400 (landstalker) to 611 (I think that's the S.W.A.T Van)

Some have 3 bits set, I don't know if I'm doing it right, I am using:


VEHICLE_SOMETHING | VEHICLE_SOMETHINGELSE | VEHICLE_ANDANOTHERONE,
//Then more


Does that turn on all 3 bits or just the first one?

SOMETHING is BIT 1, SOMETHINGELSE is BIT2, ANOTHERONE is BIT3, so would that, in binary be:

01111011
or
0b01111011

?

I really need some help, these are very interesting :/

Y_Less
26/11/2010, 01:44 PM
Can I see your enum for those? If bits 0, 1 and 2 are set the result should be 7 (0b111) - don't think in terms of numbers - each number is the bit that should be 1.

Slice
26/11/2010, 01:48 PM
enum( <<= 1 )
{
BIT_1 = 1,
BIT_2,
BIT_3,
BIT_4,
BIT_5,
BIT_6,
BIT_7,
BIT_8
};

This is how you would correctly enumerate bit flags (maybe you did; I just thought I'd mention it anyway).

Joe_
26/11/2010, 02:04 PM
I've pasted the whole code. Maybe this can help some others, too (it took a while to do this lol!)

Binary / Bit manipulation is totally new to me!

http://pastebin.com/xLqknjAB

Slice
26/11/2010, 02:13 PM
Correct would be:
enum( <<= 1 )
{
VEHICLE_AIRPLANE = 1,
VEHICLE_HELICOPTER,
VEHICLE_BIKE,
VEHICLE_CONVERTIBLE,
VEHICLE_INDUSTRIAL,
VEHICLE_LOWRIDER,
VEHICLE_OFFROAD,
VEHICLE_EMERGENCY,
VEHICLE_SALOON,
VEHICLE_SPORT,
VEHICLE_WAGON,
VEHICLE_BOAT,
VEHICLE_TRAILER,
VEHICLE_UNIQUE,
VEHICLE_RC,
VEHICLE_POLICE,
VEHICLE_PUBLIC,
VEHICLE_MEDICAL,
VEHICLE_FIRE,
VEHICLE_TRUCK,
VEHICLE_VAN,
VEHICLE_LORRY,
VEHICLE_FBI,
VEHICLE_MILITARY,
VEHICLE_ARMED,
VEHICLE_BROKEN,
VEHICLE_FARM,
VEHICLE_AIRPORT,
VEHICLE_NEWS,
VEHICLE_LAND,
VEHICLE_AIR
};


This part: enum( <<= 1 )
will tell the compiler to shift the value of the enumerated numbers by one bit each entry.
So it would look something like this:
#define VEHICLE_AIRPLANE (0b00000001)
#define VEHICLE_HELICOPTER (0b00000010)
#define VEHICLE_BIKE (0b00000100)
#define VEHICLE_CONVERTIBLE (0b00001000)
#define VEHICLE_INDUSTRIAL (0b00010000)
#define VEHICLE_LOWRIDER (0b00100000)
#define VEHICLE_OFFROAD (0b01000000)
#define VEHICLE_EMERGENCY (0b10000000)

// etc...
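To tie this back to the original question, here is a small sketch of combining and testing flags. The FLAG_ names and HasFlag are made up for the example. It also shows what goes wrong if the constants are plain sequential numbers (which is only a guess about the first version of the paste): 3 is already 1 | 2, so a value meant as one attribute tests true for two others.

```pawn
#include <a_samp>

// Each entry is the previous one shifted left by one bit.
enum (<<= 1)
{
    FLAG_BIKE = 1, // 0b001
    FLAG_BOAT,     // 0b010
    FLAG_POLICE    // 0b100
}

// Hypothetical helper: true if "flag" is set in "flags".
stock bool:HasFlag(flags, flag)
{
    return (flags & flag) != 0;
}

main()
{
    // OR turns on every listed bit, not just the first one.
    new all = FLAG_BIKE | FLAG_BOAT | FLAG_POLICE;
    printf("%d", all); // prints 7 (0b111)

    // With sequential values 1, 2, 3 the third "flag" overlaps the others.
    printf("%d", 3 & 1); // prints 1: value 3 also looks like flag 1
}
```

The same pattern works for anything with a fixed set of yes/no attributes, as long as there are no more than 32 of them per cell.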

Joe_
26/11/2010, 02:13 PM
Ah, I read something about that. I will try it, thanks.

Ok, I think it works. I will try again just in case.

So can you give attributes to just about anything? (skins, things like that.. maybe weapons)

WEAPON_HEAVY, WEAPON_RIFLE, etc?

Y_Less
26/11/2010, 02:52 PM
Yes, very easily. The vehicles code was just an example.

Slice
27/11/2010, 08:35 PM
printf is about twice as fast as print for plain strings; why is that? Shouldn't it be the other way?

Also, for plain strings, it seems SendRconCommand( "echo string" ) is equally fast compared to printf( "hello" ).

Y_Less
28/11/2010, 05:21 PM
No idea, I suspect there may be minor differences, but it depends on how things are implemented and which versions of logprintf are called if any.

Slice
01/12/2010, 11:14 PM
One way to significantly improve performance with strings passed into variadic functions would be to load the address into a variable and use that; however, is this possible? I'm thinking it shouldn't be too much hassle for anyone somewhat good at PAWN asm.

Y_Less
02/12/2010, 01:00 AM
I think if you just do getarg without an index it might return the address.

EDR Clan
14/04/2011, 04:48 PM
WOOW MAN, is it in Chinese?

Y_Less
14/04/2011, 06:18 PM
No, but if you want to translate it feel free.

MJ!
30/04/2011, 11:17 AM
Why are the next 3 topics reserved :( ?

Y_Less
30/04/2011, 12:14 PM
Because I originally wrote this topic when the SA:MP forums were on the old software, at which time there was a 20,000 character limit to posts. When the forum ported to this new software the character restriction was lifted and through numerous later edits the posts merged into one, but the extra posts from where the topic had continued still existed, only empty.

MJ!
01/05/2011, 07:45 PM
Oh, I understand now. I thought you reserved them for "money" ^^. Thanks for the answer.
And thanks for the topic.

leong124
23/05/2011, 06:53 AM
I've heard of something like this somewhere:

if(a && b)

has the same speed as:

if(a)
if(b)

I made a benchmark for that and I find they don't have the same speed.
So this is my first code to test when "a" and "b" are both true, I used strcmp to make the running time longer:

#define FILTERSCRIPT

#include <a_samp>

#pragma tabsize 0
//Credits goes to Slice
#define START_BENCH(%0); {new __a=%0,__b=0,__c,__d=GetTickCount(),__e=1;do{}\
while(__d==GetTickCount());__c=GetTickCount();__d= __c;while(__c-__d<__a||\
__e){if(__e){if(__c-__d>=__a){__e=0;__c=GetTickCount();do{}while(__c==\
GetTickCount());__c=GetTickCount();__d=__c;__b=0;} }{

#define FINISH_BENCH(%0); }__b++;__c=GetTickCount();}printf(" Bench for "\
%0": executes, by average, %.2f times/ms.",floatdiv(__b,__a));}

public OnFilterScriptInit()
{
START_BENCH(10000);//First case
if(!strcmp("54fghfghfg","54fghfghfg",false) && !strcmp("54fghfghfg","54fghfghfg",false)) {}
FINISH_BENCH("if(a && b)");
START_BENCH(10000);//Second case
if(!strcmp("54fghfghfg","54fghfghfg",false))
if(!strcmp("54fghfghfg","54fghfghfg",false)) {}
FINISH_BENCH("if(a)if(b)");
return 1;
}


Console input: loadfs test
[2011-05-23 14:39:14] Bench for if(a && b): executes, by average, 531.31 times/ms.
[2011-05-23 14:39:34] Bench for if(a)if(b): executes, by average, 537.95 times/ms.
[2011-05-23 14:39:34] Filterscript 'test.amx' loaded.

So the second case is slightly faster, but it's negligible.

Now when "a" is false while "b" is true:

#define FILTERSCRIPT

#include <a_samp>

#pragma tabsize 0
//Credits goes to Slice
#define START_BENCH(%0); {new __a=%0,__b=0,__c,__d=GetTickCount(),__e=1;do{}\
while(__d==GetTickCount());__c=GetTickCount();__d= __c;while(__c-__d<__a||\
__e){if(__e){if(__c-__d>=__a){__e=0;__c=GetTickCount();do{}while(__c==\
GetTickCount());__c=GetTickCount();__d=__c;__b=0;} }{

#define FINISH_BENCH(%0); }__b++;__c=GetTickCount();}printf(" Bench for "\
%0": executes, by average, %.2f times/ms.",floatdiv(__b,__a));}

public OnFilterScriptInit()
{
START_BENCH(10000);//First case
if(!strcmp("54fghfghfg","a54fghfghfg",false) && !strcmp("54fghfghfg","54fghfghfg",false)) {}
FINISH_BENCH("if(a && b)");
START_BENCH(10000);//Second case
if(!strcmp("54fghfghfg","a54fghfghfg",false))
if(!strcmp("54fghfghfg","54fghfghfg",false)) {}
FINISH_BENCH("if(a)if(b)");
return 1;
}


Console input: reloadfs test
[2011-05-23 14:40:02] Filter script 'test.amx' unloaded.
[2011-05-23 14:40:22] Bench for if(a && b): executes, by average, 643.18 times/ms.
[2011-05-23 14:40:42] Bench for if(a)if(b): executes, by average, 697.59 times/ms.
[2011-05-23 14:40:42] Filterscript 'test.amx' loaded.

The second case is faster than the first one by around 50 executions per millisecond.

Therefore I guess using

if(a)//Faster check, for example string hash checking
if(b)//Slower check, for example strcmp

will get more speed.


Another thing, can automata be used across scripts?
If so, automata could be used to hook functions for libraries, and it would be faster than some other methods.

It should work, but I can't get the definition to work.
Something like this:

testhook.inc:

#include <a_samp>

new bool:Loaded = false;

public OnFilterScriptInit() <>//load
{
if(!Loaded)
{
print("Library");
Loaded = true;
state OFSIState:_0;
OnFilterScriptInit();
}
return 1;
}

//#define OnFilterScriptInit() OnFilterScriptInit() <OFSIState:_0> //This definition crashes my compiler

test.pwn

#include <a_samp>
#include <testhook>

public OnFilterScriptInit() //<OFSIState:_0> (With this but without the definition it works)
{
print("hello world!");
return 1;
}

Y_Less
23/05/2011, 09:52 AM
That is an interesting idea for hooking callbacks, but it won't work at all if there is more than one library using it, which is one of the requirements for hooking methods.

As for the if checks, people really need to start looking at where the true bottlenecks in their code are, not what tiny little things can be improved all over the place. I am massively aware that it is largely my fault so I feel I need to put a stop to it too.

Omega-300
29/06/2011, 03:14 PM
The
if(a && b)
and
if(a) if(b)
are actually the same in AMX assembly, so the time difference is probably down to CPU context switching and things like that, not Pawn.
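Whatever the generated code looks like, both forms skip the second check when the first one fails, which is easy to confirm. A minimal sketch, with a made-up counting helper (Counted, CallsWithAnd and CallsWithNestedIf are names invented for this example):

```pawn
#include <a_samp>

new gCalls;

// Hypothetical helper: counts how many times a condition is evaluated.
stock bool:Counted(bool:result)
{
    gCalls++;
    return result;
}

stock CallsWithAnd()
{
    gCalls = 0;
    if (Counted(false) && Counted(true)) {}
    return gCalls;
}

stock CallsWithNestedIf()
{
    gCalls = 0;
    if (Counted(false))
    {
        if (Counted(true)) {}
    }
    return gCalls;
}

main()
{
    // Both print 1: the second condition is never evaluated.
    printf("%d", CallsWithAnd());
    printf("%d", CallsWithNestedIf());
}
```

So the useful part of the advice is the ordering (cheap check first, expensive check second), and that applies equally to either way of writing it.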

Metallica502
15/08/2011, 01:41 PM
Guys, Y_Less was a former SA-MP developer; if you even bothered to look in the SA-MP credits you would know this.

Y_Less
15/08/2011, 02:45 PM
Guys, Y_Less was a former SA-MP developer; if you even bothered to look in the SA-MP credits you would know this.

Most people know that but I'm not sure why it's relevant to this topic. I should point out that most of the things mentioned in this topic are to do with the core PAWN language, not the SA:MP API. PAWN is a third-party library (which, BTW, is open source if anyone cares to do some digging - you can find some interesting things in there), this means that the SA:MP developers had nothing to do with the development of PAWN and so in theory have no advantage in it over anyone else. Everything I've listed here comes from reading, experimenting and learning (not just PAWN).

Mr.Professional
14/11/2011, 02:21 PM
Man, why are you so pro? xD

Unknown1234
26/01/2012, 08:05 AM
Because he is one of the legends of scripting :) I see Y_Less is the PAWN creator

MP2
26/01/2012, 08:57 AM
I believe he studied computer science (or a similar subject) at college/uni.

Y_Less
26/01/2012, 11:38 AM
Electronic and Software Engineering, but that's not really why - the reason I took that course is the same reason I can write this - I read a lot on this.

The King's Bastard
27/01/2012, 07:51 PM
I've made a little test:

r = ((r * x) * y) * z; VS r = r * ((x * y) * z);

#include <a_samp>

main()
{
    const ITERS = 10000;

    new
        t,
        i,
        a[1024];

Loop:
    // Refill the array with fresh random values on every pass.
    for (i = 0; i < sizeof a; i++)
        a[i] = random(1024);

    t = GetTickCount();
    for (i = 0; i < ITERS; i++)
        aprod1(a, sizeof a);
    printf("Time 1: %d ms", GetTickCount() - t);

    t = GetTickCount();
    for (i = 0; i < ITERS; i++)
        aprod2(a, sizeof a);
    printf("Time 2: %d ms", GetTickCount() - t);

    t = GetTickCount();
    for (i = 0; i < ITERS; i++)
        aprod3(a, sizeof a);
    printf("Time 3: %d ms", GetTickCount() - t);

    t = GetTickCount();
    for (i = 0; i < ITERS; i++)
        aprod4(a, sizeof a);
    printf("Time 4: %d ms", GetTickCount() - t);

    t = GetTickCount();
    for (i = 0; i < ITERS; i++)
        aprod5(a, sizeof a);
    printf("Time 5: %d ms", GetTickCount() - t);

    // Repeat the whole benchmark forever.
    goto Loop;
}

// Unrolled by three, multiplying left to right: ((r * x) * y) * z.
aprod1(a[], n)
{
    new
        i,
        x,
        y,
        z,
        r = 1;

    for (i = 0; i < n - 2; i += 3)
    {
        x = a[i];
        y = a[i + 1];
        z = a[i + 2];
        r = ((r * x) * y) * z;
    }

    // Handle the one or two elements left over.
    for ( ; i < n; i++)
        r *= a[i];

    return r;
}

// Unrolled by three, multiplying the three elements first: r * ((x * y) * z).
aprod2(a[], n)
{
    new
        i,
        x,
        y,
        z,
        r = 1;

    for (i = 0; i < n - 2; i += 3)
    {
        x = a[i];
        y = a[i + 1];
        z = a[i + 2];
        r = r * ((x * y) * z);
    }

    for ( ; i < n; i++)
        r *= a[i];

    return r;
}

// Plain loop, one element per iteration.
aprod3(a[], n)
{
    new
        i,
        r = 1;

    for (i = 0; i < n; i++)
        r *= a[i];

    return r;
}

// Same as aprod2, but indexing with pre-increment.
aprod4(a[], n)
{
    new
        i,
        x,
        y,
        z,
        r = 1;

    for (i = -1; i < n - 3; )
    {
        x = a[++i];
        y = a[++i];
        z = a[++i];
        r = r * ((x * y) * z);
    }

    // Here i is the index of the last element read (or -1), so step past it
    // first. Without the ++i that element would be multiplied in twice.
    for (++i; i < n; i++)
        r *= a[i];

    return r;
}

// Same as aprod2, but indexing with post-increment.
aprod5(a[], n)
{
    new
        i,
        x,
        y,
        z,
        r = 1;

    for (i = 0; i < n - 2; )
    {
        x = a[i++];
        y = a[i++];
        z = a[i++];
        r = r * ((x * y) * z);
    }

    for ( ; i < n; i++)
        r *= a[i];

    return r;
}


Results:
[21:17:20] Time 1: 2762 ms
[21:17:22] Time 2: 2677 ms
[21:17:25] Time 3: 2596 ms
[21:17:27] Time 4: 1992 ms
[21:17:29] Time 5: 1942 ms
[21:17:32] Time 1: 2736 ms
[21:17:34] Time 2: 2695 ms
[21:17:37] Time 3: 2587 ms
[21:17:39] Time 4: 2054 ms
[21:17:41] Time 5: 1823 ms
[21:17:43] Time 1: 2648 ms
[21:17:46] Time 2: 2580 ms
[21:17:49] Time 3: 2534 ms
[21:17:50] Time 4: 1910 ms
[21:17:52] Time 5: 1819 ms
[21:17:55] Time 1: 2641 ms
[21:17:58] Time 2: 2588 ms
[21:18:00] Time 3: 2536 ms
[21:18:02] Time 4: 1922 ms
[21:18:04] Time 5: 1822 ms

Done on an Intel Core i5-2520M.

There is a significant improvement between the third (aprod3) (which probably everyone would naïvely use) and the fourth and fifth (aprod4 & aprod5).
The difference between 2 and 5 is also remarkable.
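
Before comparing timings of variants like these, it can help to confirm that each one returns the same product, especially for lengths that are not a multiple of three. The following is only a minimal sketch: ProductSimple and ProductUnrolled are hypothetical names used for illustration, not functions from the test above, and the array is filled with odd values purely so that the wrapped 32-bit product never becomes 0.

```pawn
#include <a_samp>

// Straightforward reference product (hypothetical helper).
ProductSimple(const a[], n)
{
    new r = 1;
    for (new i = 0; i < n; i++)
    {
        r *= a[i];
    }
    return r;
}

// Unrolled by three, with the leftover elements handled afterwards
// (hypothetical helper).
ProductUnrolled(const a[], n)
{
    new i = 0, r = 1;
    for ( ; i < n - 2; i += 3)
    {
        r *= a[i] * a[i + 1] * a[i + 2];
    }
    for ( ; i < n; i++)
    {
        r *= a[i];
    }
    return r;
}

main()
{
    new a[16];
    for (new i = 0; i < sizeof a; i++)
    {
        // Odd values only, so the wrapped product stays non-zero.
        a[i] = 2 * random(512) + 1;
    }

    // Try every length, including those that are not a multiple of three.
    for (new n = 0; n <= sizeof a; n++)
    {
        if (ProductSimple(a, n) != ProductUnrolled(a, n))
        {
            printf("Mismatch at n = %d", n);
        }
    }
    print("Check finished");
}
```

Any "Mismatch" line would mean the unrolled version reads an element twice or skips one, in which case its timing is not comparable with the others.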

varuncoolrule
27/01/2012, 07:53 PM
Awesome.
Y_Less, how did you type all of this? I went mad and couldn't even read it all.

Y_Less
27/01/2012, 11:01 PM
Quoting The King's Bastard:
I've made a little test:

There is a significant improvement between the third (aprod3) (which probably everyone would naïvely use) and the fourth and fifth (aprod4 & aprod5).
The difference between 2 and 5 is also remarkable.

That's not really a significant sample size.

I've said this before - I'm frankly sad I ever wrote this topic, people are concentrating WAY too much on the little things when they don't even know where the slow parts of their code are. Yes, you can spend days making the product of an array faster, but who cares if another part of your mode is called much more frequently and is much slower? Most of the time improving slow code (once you KNOW it is slow) is a matter of using a different algorithm, not changing from "a = a + 1" to "++a".
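
One rough way to find out where the time actually goes is to accumulate the time spent in each suspect callback while the mode runs, and compare the totals afterwards. This is only a sketch under assumptions: gUpdateTime and gUpdateCalls are hypothetical names, OnPlayerUpdate is used purely as an example of a frequently called callback, and GetTickCount only has millisecond resolution, so the total is an estimate that becomes meaningful over many calls.

```pawn
#include <a_samp>

// Hypothetical counters, one pair per callback being measured.
new gUpdateTime;
new gUpdateCalls;

main()
{
}

public OnPlayerUpdate(playerid)
{
    new start = GetTickCount();

    // ... the existing body of the callback goes here ...

    // Add the elapsed milliseconds for this call to the running total.
    gUpdateTime += GetTickCount() - start;
    gUpdateCalls++;
    return 1;
}

public OnGameModeExit()
{
    // Report the totals once, when the mode shuts down.
    printf("OnPlayerUpdate: %d calls, %d ms in total", gUpdateCalls, gUpdateTime);
    return 1;
}
```

Comparing totals like this across callbacks gives a first idea of which part of a mode deserves attention before any micro-optimisation is attempted.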

Slice
27/04/2012, 08:31 AM
Worth mentioning: using numbers without decimals in functions that ask for floats will essentially wrap these numbers in a function call, even though they're constant.

Bottom line: Always put decimals on your float values, even if it's just .0!

http://slice-vps.nl/ppg/#gist=dfae5c1da66cc3d14e8d
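
As a small illustration, and assuming the effect described here is the one produced by the mixed-tag operators declared in float.inc (that link to the post is my assumption, not something the post states), the two multiplications below give the same value but compile differently:

```pawn
#include <a_samp>

main()
{
    new Float:x = 10.0;

    // Mixed tags: the stock operator from float.inc is chosen, and it
    // converts the integer constant with float() every time this runs.
    new Float:mixed = x * 2;

    // Both operands are Float: values, so this maps directly onto floatmul.
    new Float:plain = x * 2.0;

    printf("%f %f", mixed, plain);
}
```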

sampreader
15/05/2012, 03:45 PM
under returning values

(Note the double brackets to avoid the unintended assignment warning)

Then a would be assigned to b, so be would be 1, and that 1 would still be active in the if effectively as the return of the assignment, so this if is true, however:

Shouldn't that be just "b"?
(sorry for the bump)

TzAkS.
15/05/2012, 03:56 PM
Really nice, what would the SA:MP world do without you :))

Y_Less
15/05/2012, 03:57 PM
Don't apologise for bumping old topics - it's encouraged if you have something to add or want an answer to an old topic. And as you have found a mistake, that is a contribution, so thanks!

Also, I'm barely sure what that sentence means (though I am reading it out of context), but I've been planning to edit this anyway because some bits are a little hard to read.

Speaking of which, if anyone wants to proof-read the first post and find as many mistakes and confusing parts as possible (I know my writing style is a little tricky to read sometimes) please do and you will be credited! I need it doing at the moment (Slice/JaTochNietDan, if you see this post - you know why).

Jonny5
14/06/2012, 04:57 AM
I'll give it a reread. I could use it!

Anyway, I've got a question: will this
plain string

print(\"C:\all my work\novel.rtf");


use the same space as this string

print("C:\\all my work\\novel.rtf");


I guess what I'm asking is: do the escape sequence characters take more space?
I'm thinking they are in fact the same, and the compiler just adds the extra \ in the string where needed, or the like.
I've never really seen anyone use plain strings, but I saw them in the PAWN reference and,
just out of curiosity, wanted to know if one is better than the other where applicable.

Slice
14/06/2012, 06:13 AM
No, both of those will end up with the same string size. Escaping only matters at compile time. A backslash in a string usually tells the compiler to treat the next character differently (e.g. \n or \"), and the backslash itself isn't included in the compiled AMX.
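
A quick way to check this yourself is to store both literals in arrays and compare them. This is just a sketch, and it assumes the compiler in use accepts the \" plain string prefix with \ as the control character, as in the question above:

```pawn
#include <a_samp>

main()
{
    // Plain (raw) string: the backslashes are taken literally.
    new raw[] = \"C:\all my work\novel.rtf";

    // Ordinary string: each \\ is stored as a single backslash.
    new escaped[] = "C:\\all my work\\novel.rtf";

    // Array sizes in cells, including the terminating zero.
    printf("%d %d", sizeof raw, sizeof escaped);

    // strcmp returns 0 when the two strings are identical.
    printf("%d", strcmp(raw, escaped));
}
```

If the two literals compile to the same data, the two sizes match and strcmp returns 0.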

YourGTA
10/07/2012, 04:53 PM
Wow, a lot of content there. It's all very useful and I will refer back to this - thank you very much

M3mPHi$_S3
10/07/2012, 04:59 PM
woow that's it Y_Less best release again... and too long

Dripac
10/07/2012, 07:41 PM
So much text, impossible to read and understand in less than 2 days

Igi_Guduric
13/07/2012, 07:06 AM
Good job man.

10/10

Xentiarox
02/08/2012, 11:38 PM
Well I have done a benchmark with this code:

http://pastebin.com/dL7RWCqc

and these are the results:


[01:43:44] 192 ms = 0.00001920000067912042 ms / empty loop
[01:43:44] 160 ms = 0.00001599999995960388 ms / empty loop
[01:43:44] 288 ms = 0.00002879999919969122 ms / new VAR loop
[01:43:45] 256 ms = 0.00002560000029916409 ms / new VAR loop
[01:43:45] 480 ms = 0.00004800000169780105 ms / new VAR = i loop
[01:43:46] 480 ms = 0.00004800000169780105 ms / new VAR = i loop
[01:43:48] 2816 ms = 0.00028159999055787920 ms / new VAR[256] loop
[01:43:51] 2848 ms = 0.00028479998582042753 ms / new VAR[256] loop
[01:43:54] 3072 ms = 0.00030720001086592674 ms / new VAR[256] = "Hello World!" loop
[01:43:57] 3104 ms = 0.00031040000612847506 ms / new VAR[256] = "Hello World!" loop
[01:44:01] 3616 ms = 0.00036159998853690922 ms / new VAR[256];VAR = cSTR loop
[01:44:05] 3616 ms = 0.00036159998853690922 ms / new VAR[256];VAR = cSTR loop
[01:44:05] 192 ms = 0.00001920000067912042 ms / static sVAR loop
[01:44:05] 224 ms = 0.00002239999957964755 ms / static sVAR loop
[01:44:05] 256 ms = 0.00002560000029916409 ms / static sVAR = i loop
[01:44:06] 288 ms = 0.00002879999919969122 ms / static sVAR = i loop
[01:44:06] 256 ms = 0.00002560000029916409 ms / gVAR loop
[01:44:06] 224 ms = 0.00002239999957964755 ms / gVAR loop
[01:44:06] 256 ms = 0.00002560000029916409 ms / gVAR = i loop
[01:44:07] 256 ms = 0.00002560000029916409 ms / gVAR = i loop
[01:44:07] 224 ms = 0.00002239999957964755 ms / gSTR loop
[01:44:07] 256 ms = 0.00002560000029916409 ms / gSTR loop
[01:44:08] 448 ms = 0.00004479999915929511 ms / gSTR = "Hello World!" loop
[01:44:08] 448 ms = 0.00004479999915929511 ms / gSTR = "Hello World!" loop
[01:44:09] 928 ms = 0.00009280000085709616 ms / gSTR = cSTR loop
[01:44:10] 896 ms = 0.00008959999831859022 ms / gSTR = cSTR loop


So, what do you think, what have you learned, and what are your conclusions?

Well if I look at the results it seems that the fastest loop WITH something in it is actually the
static sVAR loop
So I think I am going to define static variables in loops where I assign values each iteration.

And as for assigning strings, well, there's no way I'm going to use new VAR[256];VAR = cSTR, which takes about 2.7 seconds longer than gSTR = cSTR (3616 ms vs. 928 ms)

And I have learned what the speed differences are between using global/local/static variables and when and how you use them.

Now it's your turn ;)

P.S: of course the speed differences are minimal, but in comparison terms we use "times faster and times slower", so don't say "it's just 0.0001 ms"!

Xentiarox
03/08/2012, 12:21 AM
http://www.youtube.com/watch?v=nogqdsRR5dA

I'm sorry, but you're optimizing the wrong things. We're talking about nanoseconds.

This is not an optimization topic -.- just a variable usage discussion, damn it. Please read everything, WHOLE POSTS PLZ.