<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml"><head><meta http-equiv="Content-Type" content="text/html; charset=UTF-8" /><title>Be Newsletters - Volume 4: 1999</title><link rel="stylesheet" href="be_newsletter.css" type="text/css" media="all" /><link rel="shortcut icon" type="image/vnd.microsoft.icon" href="./images/favicon.ico" /><!--[if IE]>
<link rel="stylesheet" type="text/css" href="be_newsletter_ie.css" />
<![endif]--><meta name="generator" content="DocBook XSL Stylesheets V1.73.2" /><link rel="start" href="index.html" title="Be Newsletters" /><link rel="up" href="volume4.html" title="Volume 4: 1999" /><link rel="prev" href="Issue4-24.html" title="Issue 4-24, June 16, 1999" /><link rel="next" href="Issue4-26.html" title="Issue 4-26, June 30, 1999" /></head><body><div id="header"><div id="headerT"><div id="headerTL"><a accesskey="p" href="Issue4-24.html" title="Issue 4-24, June 16, 1999"><img src="./images/navigation/prev.png" alt="Prev" /></a> <a accesskey="u" href="volume4.html" title="Volume 4: 1999"><img src="./images/navigation/up.png" alt="Up" /></a> <a accesskey="n" href="Issue4-26.html" title="Issue 4-26, June 30, 1999"><img src="./images/navigation/next.png" alt="Next" /></a></div><div id="headerTR"><div id="navigpeople"><a href="http://www.haiku-os.org"><img src="./images/People_24.png" alt="haiku-os.org" title="Visit The Haiku Website" /></a></div><div class="navighome" title="Home"><a accesskey="h" href="index.html"><img src="./images/navigation/home.png" alt="Home" /></a></div><div class="navigboxed" id="naviglang" title="English">en</div></div><div id="headerTC">Be Newsletters - Volume 4: 1999</div></div><div id="headerB">Prev: <a href="Issue4-24.html">Issue 4-24, June 16, 1999</a>  Up: <a href="volume4.html">Volume 4: 1999</a>  Next: <a href="Issue4-26.html">Issue 4-26, June 30, 1999</a></div><hr /></div><div class="article"><div xmlns="" xmlns:d="http://docbook.org/ns/docbook" class="titlepage"><div><div xmlns:d="http://docbook.org/ns/docbook"><h2 xmlns="http://www.w3.org/1999/xhtml" class="title"><a id="Issue4-25"></a>Issue 4-25, June 23, 1999</h2></div></div></div><div class="sect1"><div xmlns="" xmlns:d="http://docbook.org/ns/docbook" class="titlepage"><div><div xmlns:d="http://docbook.org/ns/docbook"><h2 xmlns="http://www.w3.org/1999/xhtml" class="title"><a id="Engineering4-25"></a>Be Engineering Insights: AreaWatch, or Smaller Is Always Better...</h2></div><div xmlns:d="http://docbook.org/ns/docbook"><span xmlns="http://www.w3.org/1999/xhtml" class="author">By <span class="firstname">Pierre</span> <span class="surname">Raynaud</span></span></div></div></div><p>
If you look at the BeOS R4 datasheet
&lt;http://www-classic.be.com/products/beos/beos_datasheet.html&gt; you'll find
many bold keywords that describe the best features of our modern OS --
virtual memory being one of them. On one hand, virtual memory is
certainly a great feature, as it makes involuntary cross-application
memory corruption impossible, and lets you virtually increase the amount
of usable memory by putting extra data on a hard drive, in what's called
a "swap file".
</p><p>
From my point of view, though, this great feature has one big drawback:
it makes a typical careless developer think that the memory available is
almost limitless.... Why should he care about optimizing memory usage,
since modern computers ship with more and more memory, and there will
always be still more virtual memory available on a multi-gigabyte hard
drive?
</p><div class="sect2"><div xmlns="" xmlns:d="http://docbook.org/ns/docbook" class="titlepage"><div><div xmlns:d="http://docbook.org/ns/docbook"><h3 xmlns="http://www.w3.org/1999/xhtml" class="title"><a id="id792200"></a>So, Why?</h3></div></div></div><p>
By not caring about memory usage, a developer buys his application a
one-way, first-class ticket to performance hell. First, virtual memory
may be huge, but it's certainly not fast. At one end, regular RAM
supports bandwidths of several hundred MB/s and responds to a random
request in a fraction of a microsecond. At the other end, an average hard drive won't
even reach 10MB/s, and worse yet, it will need many milliseconds to reply
to a random request. In practice, once you start "swapping" actively,
it's common to see average performance become 10 or even 100 times
slower. For a short operation, the performance impact can even reach
ratios as high as 1,000 or 10,000 times slower. You don't want that to
happen to your application, do you?
</p><p>
But even if you don't reach such extreme (and sadly not so uncommon)
slow-downs, using more memory is still a bad idea. Modern processors have
become so fast during the last 10 years that RAM technology hasn't been
able to keep pace. That's why modern processors don't access
the main memory directly, but go through multiple levels of memory
caches. The L1 (or level 1 cache) is the closest to the processor, and
also the fastest, but because it's extremely expensive, it's usually
rather small (typically 32K or 64K). Beyond that, most processors have an
L2 (level 2 cache), slower than the L1 but faster than the main memory.
L2 is still expensive, but not as much as L1; its typical size is around
256K to 1MB. In real life, both of those caches are far too small to
contain all the data commonly used by a program, so your machine will
spend a big chunk of its time exchanging data between L1, L2, and main
memory. And this is often where your application's performance bottleneck
will be.
</p><p>
So if you want some simple advice, try to keep the memory footprint of
your application as small as possible. Here are a few very simple things
you can do to achieve that:
</p><ul class="itemizedlist"><li><p>
When allocating integer arrays, pick the smallest integer type that
fits your needs. For example, if you're going to store values between 0
and 199, don't just use an <span class="type">int</span> or <span class="type">int32</span>
by default. A <span class="type">uint8</span> will
do nicely, and will use four times less memory.
</p></li><li><p>
When you define structs and classes that could be allocated in huge
numbers, try to choose the integer types in a similarly efficient way,
and order them sensibly. A simple way to do that is to always put the
longest types first. For example, compare these two structs:
</p><pre class="programlisting c">
struct <span class="type">_poor_order_</span> {
    <span class="type">int8</span>   <code class="varname">a8</code>;
    <span class="type">double</span> <code class="varname">d64</code>;
    <span class="type">int8</span>   <code class="varname">b8</code>;
    <span class="type">int32</span>  <code class="varname">a32</code>;
    <span class="type">int16</span>  <code class="varname">a16</code>;
};

struct <span class="type">_good_order_</span> {
    <span class="type">double</span> <code class="varname">d64</code>;
    <span class="type">int32</span>  <code class="varname">a32</code>;
    <span class="type">int16</span>  <code class="varname">a16</code>;
    <span class="type">int8</span>   <code class="varname">a8</code>;
    <span class="type">int8</span>   <code class="varname">b8</code>;
};
</pre><p>
<span class="application">gcc</span> will tell you that
</p><pre class="screen">
<code class="function">sizeof</code>(_poor_order_) == 24; <code class="function">sizeof</code>(_good_order_) == 16;
</pre><p>
Guess which one you want to use...?
</p></li><li><p>
When you need large buffers, think twice before taking the laziest
path, which is to give those buffers as much space as first seems
necessary. Just take a few minutes to ask yourself these simple
questions:
</p><ul class="itemizedlist"><li><p>
Do you really use all those bits of information?
</p><p>
For example, the following <span class="type">bool</span> array:
</p><pre class="programlisting c">
<span class="type">bool</span> <code class="varname">flags1</code>[1024*1024]; <span class="comment">// 1 MB</span>
</pre><p>
can be successfully replaced by:
</p><pre class="programlisting c">
<span class="type">uint8</span> <code class="varname">flags8</code>[128*1024]; <span class="comment">// 128 KB</span>
</pre><p>
By just modifying your code like this:
</p><pre class="programlisting c">
if (<code class="varname">flags1</code>[<code class="varname">index</code>] == <code class="constant">true</code>)
-&gt; if ((<code class="varname">flags8</code>[<code class="varname">index</code>&gt;&gt;3]&gt;&gt;(<code class="varname">index</code>&amp;7)) &amp; 1)
<code class="varname">flags1</code>[<code class="varname">index</code>] = <code class="constant">true</code>;
-&gt; <code class="varname">flags8</code>[<code class="varname">index</code>&gt;&gt;3] |= 1&lt;&lt;(<code class="varname">index</code>&amp;7);
<code class="varname">flags1</code>[<code class="varname">index</code>] = <code class="constant">false</code>;
-&gt; <code class="varname">flags8</code>[<code class="varname">index</code>&gt;&gt;3] &amp;= 0xff-(1&lt;&lt;(<code class="varname">index</code>&amp;7));
</pre><p>
The flags8 code will be a little slower, but unless it sits in a tight
loop, the reduction in memory footprint will most likely be worth it. (A
small sketch of helper functions for this trick appears right after this list.)
</p></li><li><p>
Do you really need all that data?
</p><p>
For example, if you have an array of <span class="type">int32</span> that you only use in two
places, like so:
</p><pre class="programlisting c">
<span class="type">int32</span> <code class="varname">datas</code>[1024*1024];
...
<code class="varname">datas</code>[<code class="varname">index</code>] = <code class="varname">value</code>;
...
if (<code class="varname">datas</code>[<code class="varname">index</code>] &gt; 100000)
</pre><p>
You may want to optimize it like this:
</p><pre class="programlisting c">
<span class="type">bool</span> <code class="varname">greater_than_100000</code>[1024*1024];
...
<code class="varname">greater_than_100000</code>[<code class="varname">index</code>] = (<code class="varname">value</code> &gt; 100000);
...
if (<code class="varname">greater_than_100000</code>[<code class="varname">index</code>])
</pre><p>
This is already four times smaller. And then you can also use the
previous trick and get it 32 times smaller...
</p></li><li><p>
Do you really need a buffer that big? Hmmm... count again; maybe
by changing the way you order the data in your array, you can make it
quite a bit smaller. Definitely worth it.
</p></li><li><p>
Do you really need a new buffer? Maybe you already have another
buffer holding temporary data, and you'll never need both at the
same time... so why not reuse it?
</p></li></ul></li></ul><p>
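As a footnote to the bit-packed flags trick above, here is a minimal sketch (my own, not from the article's sample code) of how the shifting and masking can be wrapped in small inline helpers so it lives in exactly one place; it assumes the usual BeOS integer types from SupportDefs.h:
</p><pre class="programlisting c">
#include &lt;SupportDefs.h&gt;

/* Hypothetical helpers: keep all bit-twiddling for the packed flags here. */
static inline bool get_flag(const uint8 *flags, uint32 index)
{
    return (flags[index &gt;&gt; 3] &gt;&gt; (index &amp; 7)) &amp; 1;
}

static inline void set_flag(uint8 *flags, uint32 index, bool value)
{
    if (value)
        flags[index &gt;&gt; 3] |= 1 &lt;&lt; (index &amp; 7);
    else
        flags[index &gt;&gt; 3] &amp;= ~(1 &lt;&lt; (index &amp; 7));
}
</pre><p>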
Beyond that, there are a few less simple things you should think about:
</p><ul class="itemizedlist"><li><p>
When you use a ring buffer or multiple buffers, try to make them as
small as possible. Do you really need a triple-buffer? Wouldn't a
double-buffer do the same job with smarter glue code?
</p></li><li><p>
Beware of excessive parallelism. Multi-threading is great but it has
its limits. For example, if you need to apply four filters on a buffer
stream, you can do it by applying them in one thread, one after
another, and in that case you may be able to just use one buffer per
frame, and maybe a few more for the I/O queue. If you try to use one
thread per filter, you'll need buffer queues between each thread (so
that you uncouple them and allow them to potentially run in parallel on
a quad-processor machine). But by doing that, you may easily end up
with a dozen buffers, many of them simultaneously used by your four
threads. As a result, your L2 cache will be thrashing all the time on a
single CPU machine. The single threaded version could end up being
quite a bit faster in some cases, and the memory footprint would
certainly be much smaller.
</p></li><li><p>
More generally, it's usually a bad idea to cut a high-bandwidth data
stream into too many pieces, especially on a single-processor machine
(which is still the common case). An optimal solution is to adjust the
number of processing threads dynamically, based on the CPU count, as
sketched below.
</p></li></ul><p>
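To make that last point concrete, here is a rough sketch (my own, using only the documented Kernel Kit call <code class="function">get_system_info()</code>) of picking a thread count from the number of CPUs the machine actually has:
</p><pre class="programlisting c">
#include &lt;OS.h&gt;

/* Sketch: choose how many processing threads to spawn, based on the
   CPU count reported by the kernel. */
static int32 processing_thread_count(void)
{
    system_info info;
    if (get_system_info(&amp;info) != B_OK)
        return 1;               /* be conservative if the call fails */
    return info.cpu_count;      /* e.g. one processing thread per CPU */
}
</pre><p>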
There's much more to be said about reducing memory usage, but this is
only a newsletter article, not a book, so I'll add just one more thing.
On a modern operating system, memory is always accessed through the CPU's
MMU, which gives us an easy way to know exactly what memory is used
(physical or virtual). In the case of the BeOS, you access that
information through the "area" API (check:
<a class="link bebook" href="../BeBook/TheKernelKit_Areas.html">TheKernelKit_Areas.html</a>
for more details). The "listarea" shell command is the
standard way to dump the current state of the memory mapping. It's full
of interesting information, but it's a little difficult to use. So to
make it easier for you to track the dynamic memory allocation behavior of
your program, I wrote a little application, named "AreaWatch", which
displays the same information in a more accessible way. You'll find it
(with the source code and a detailed ReadMe) at
&lt;ftp://ftp.be.com/pub/contrib/develop/AreaWatch.zip&gt;. It contains quite a
few nifty little tricks, so don't hesitate to read the ReadMe in
detail.
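</p><p>
As an illustration of what those tools are reporting, here is a minimal sketch (mine, not part of AreaWatch) that walks the calling team's areas with the documented Kernel Kit calls and totals the RAM they actually occupy:
</p><pre class="programlisting c">
#include &lt;stdio.h&gt;
#include &lt;OS.h&gt;

/* Sketch: list the current team's areas and sum their resident RAM. */
int main(void)
{
    area_info info;
    int32 cookie = 0;
    size_t total = 0;

    /* a team id of 0 means "the current team" */
    while (get_next_area_info(0, &amp;cookie, &amp;info) == B_OK) {
        printf("%-32s size %8ld  in ram %8ld\n",
            info.name, (long)info.size, (long)info.ram_size);
        total += info.ram_size;
    }
    printf("total in ram: %ld bytes\n", (long)total);
    return 0;
}
</pre><p>
AreaWatch presents essentially these numbers, kept up to date as your program runs.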
</p></div></div><hr class="pagebreak" /><div class="sect1"><div xmlns="" xmlns:d="http://docbook.org/ns/docbook" class="titlepage"><div><div xmlns:d="http://docbook.org/ns/docbook"><h2 xmlns="http://www.w3.org/1999/xhtml" class="title"><a id="DevWorkshop4-25"></a>Developers' Workshop: Media's Little Helpers</h2></div><div xmlns:d="http://docbook.org/ns/docbook"><span xmlns="http://www.w3.org/1999/xhtml" class="author">By <span class="firstname">Owen</span> <span class="surname">Smith</span></span></div></div></div><p>
In the rush of working on Media Kit sample code, I thought I'd pause for
a moment and zoom in on a few simple classes that I've created for my
media applications. You might find them useful when writing your own
applications as well. These classes are <code class="classname">MediaNodeWrapper</code>,
<code class="classname">Connection</code>, and
<code class="classname">NodeLoopManager</code>.
</p><div class="sect2"><div xmlns="" xmlns:d="http://docbook.org/ns/docbook" class="titlepage"><div><div xmlns:d="http://docbook.org/ns/docbook"><h3 xmlns="http://www.w3.org/1999/xhtml" class="title"><a id="id792722"></a>MediaNodeWrapper: Roster Meets Node</h3></div></div></div><p>
The <code class="classname">MediaNodeWrapper</code> class is one I'm using more and more to work
with media nodes. It wraps a <span class="type">media_node</span> structure
and provides a simplified set of methods for issuing commands to the
node. These commands end up calling Media Roster functions to work on the
<span class="type">media_node</span>. To make them easy to remember, there is a tight
correspondence between <code class="classname">BMediaRoster</code> methods and
<code class="classname">MediaNodeWrapper</code> methods.
</p><p>
One feature that I added to <code class="classname">MediaNodeWrapper</code> was the ability to "lock"
certain nodes, such as the system mixer, that you as an application
shouldn't be messing with. Issuing commands to these nodes has no effect.
So, when I want to tell a node to start, I don't have to think about
whether it's a system node or one of my own; I simply lock the node
wrapper when I first create it, and then treat it like any other node.
This makes it a snap, for example, to start up a miscellaneous group of
nodes.
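</p><p>
The sample code linked at the end of this article has the real class; purely as an illustration of the locking idea, a trimmed-down wrapper might look something like this (the member names here are invented):
</p><pre class="programlisting cpp">
#include &lt;MediaRoster.h&gt;

// Sketch only: the shipping MediaNodeWrapper is more complete.
class MediaNodeWrapper {
public:
    MediaNodeWrapper(const media_node&amp; node, bool locked = false)
        : fNode(node), fLocked(locked) {}

    // Start the node, unless it's a "locked" system node (such as
    // the system mixer), in which case this is a harmless no-op.
    void Start(bigtime_t performanceTime) const
    {
        if (fLocked)
            return;
        BMediaRoster::Roster()-&gt;StartNode(fNode, performanceTime);
    }

private:
    media_node  fNode;
    bool        fLocked;
};
</pre><p>
Every other command the wrapper forwards to the roster can be guarded the same way.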
</p></div><div class="sect2"><div xmlns="" xmlns:d="http://docbook.org/ns/docbook" class="titlepage"><div><div xmlns:d="http://docbook.org/ns/docbook"><h3 xmlns="http://www.w3.org/1999/xhtml" class="title"><a id="id792773"></a>Connection: Plug 'n' Chug</h3></div></div></div><p>
The facilities for connecting nodes in the Media Kit are quite rich, and
allow you to do some pretty fancy stuff. However, in working with nodes,
I noticed that I tended to follow one simple pattern when connecting
nodes:
</p><ul class="itemizedlist"><li><p>
Get one output from the upstream node.
</p></li><li><p>
Get one input from the downstream node.
</p></li><li><p>
Connect them together using the upstream node's format, and remember
the information for the connection.
</p></li><li><p>
Later on, disconnect them.
</p></li></ul><p>
The <code class="classname">Connection</code> class implements this connection pattern. You use
<code class="methodname">Connection::Make()</code> to attempt the connection. If the connection is
successful, you get a pointer to a <code class="classname">Connection</code> object back. This
connection object combines the information you get from the <span class="type">media_input</span>
and <span class="type">media_output</span> structures into one cohesive structure. Finally, when
you're done with the connection, you simply delete the <code class="classname">Connection</code> object.
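</p><p>
To make the pattern concrete, here is a rough sketch of what a <code class="methodname">Make()</code> along those lines can do with the stock <code class="classname">BMediaRoster</code> calls; the real class in the sample code keeps more state and handles errors more carefully, and the <code class="methodname">Node()</code> accessor on the wrapper is an assumption of mine:
</p><pre class="programlisting cpp">
// Sketch only: simplified from the pattern described above.
Connection* Connection::Make(const MediaNodeWrapper&amp; unode,
    const MediaNodeWrapper&amp; dnode)
{
    BMediaRoster* roster = BMediaRoster::Roster();
    media_output output;
    media_input input;
    int32 count;

    // One free output from the upstream node, one free input downstream.
    if (roster-&gt;GetFreeOutputsFor(unode.Node(), &amp;output, 1, &amp;count) != B_OK
            || count &lt; 1)
        return NULL;
    if (roster-&gt;GetFreeInputsFor(dnode.Node(), &amp;input, 1, &amp;count) != B_OK
            || count &lt; 1)
        return NULL;

    // Connect using the upstream node's format, and remember the result.
    media_format format = output.format;
    media_output realOutput;
    media_input realInput;
    if (roster-&gt;Connect(output.source, input.destination, &amp;format,
            &amp;realOutput, &amp;realInput) != B_OK)
        return NULL;

    return new Connection(realOutput, realInput);
}
</pre><p>
Deleting the <code class="classname">Connection</code> later is the natural place to call <code class="methodname">BMediaRoster::Disconnect()</code> with the saved endpoints.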
</p><p>
Here's an example of MediaNodeWrapper and Connection in action. Let's say
we had a node that we created, and we wanted to hook it up to the system
mixer and start it. First, to connect and start two generic nodes, we'd
do something like this:
</p><pre class="programlisting cpp">
<span class="type"><code class="classname">Connection</code>*</span> <code class="function">ConnectAndStart</code>(<span class="type">const MediaNodeWrapper&amp;</span> <code class="parameter">unode</code>,
    <span class="type">const MediaNodeWrapper&amp;</span> <code class="parameter">dnode</code>)
{
    <span class="comment">// Set upstream's time source to downstream's t.s.</span>
    <code class="parameter">unode</code>.<code class="methodname">SlaveTo</code>(<code class="parameter">dnode</code>);
    <span class="type"><code class="classname">Connection</code>*</span> <code class="varname">c</code> = <code class="classname">Connection</code>::<code class="methodname">Make</code>(<code class="parameter">unode</code>, <code class="parameter">dnode</code>);
    if (<code class="varname">c</code>) {
        <span class="type"><code class="classname">BTimeSource</code>*</span> <code class="varname">ts</code> = <code class="parameter">unode</code>.<code class="methodname">MakeTimeSource</code>();
        <span class="type">bigtime_t</span> <code class="varname">tpStart</code> = <code class="varname">ts</code>-&gt;<code class="methodname">Now</code>() + <code class="parameter">unode</code>.<code class="methodname">GetLatency</code>();
        <code class="parameter">unode</code>.<code class="methodname">Start</code>(<code class="varname">tpStart</code>);
        <code class="parameter">dnode</code>.<code class="methodname">Start</code>(<code class="varname">tpStart</code>);
    }
    <span class="comment">// Return the Connection on success, NULL on failure.</span>
    return <code class="varname">c</code>;
}
</pre><p>
Then, we use this to connect our node to the system mixer:
</p><pre class="programlisting cpp">
<span class="type">media_node</span> <code class="varname">myNode</code>, <code class="varname">mixerNode</code>;
...
<code class="classname">MediaNodeWrapper</code> <code class="varname">myWrap</code>(<code class="varname">myNode</code>);
<code class="classname">MediaNodeWrapper</code> <code class="varname">mixerWrap</code>(<code class="varname">mixerNode</code>, <code class="constant">true</code>); <span class="comment">// locked</span>
<code class="function">ConnectAndStart</code>(<code class="varname">myWrap</code>, <code class="varname">mixerWrap</code>);
</pre></div><div class="sect2"><div xmlns="" xmlns:d="http://docbook.org/ns/docbook" class="titlepage"><div><div xmlns:d="http://docbook.org/ns/docbook"><h3 xmlns="http://www.w3.org/1999/xhtml" class="title"><a id="id793066"></a>NodeLoopManager: Repeating Ourselves</h3></div></div></div><p>
Up until now, the methods I've presented for looping play (i.e.
automatically starting over when you reach the end of a file) have not
been satisfactory for most applications. I implemented this by writing my
own file reader node and altering its behavior to do this automatically.
However, for people who are using external nodes (e.g. the system
extractor node that you'll likely get if you sniff a media file), this
approach won't work. We need a way to get any node to rewind to the
beginning of a piece of media when it reaches the end, and to keep doing
this until we've stopped the node.
</p><p>
One approach to this problem might be to simply queue up a long list of
Seeks to media time 0 at regular intervals. The idea here is that, every
time the node reaches the end of the file, it will process a Seek command
and start over. However, there might be nodes out there which can only
queue one pending Seek at a time (remember, this is all that a node is
guaranteed to give you). So, a better approach is to send one Seek
operation at a time, and wait until that <code class="methodname">Seek()</code>
has been completed before queueing up another one.
</p><p>
There are two questions that remain to be answered:
</p><ul class="itemizedlist"><li><p>
How long should the node play before Seeking? In the case of a file,
we want to play the entire file before looping back to the beginning.
This means that we need to know the duration of the media in the file
to set up the Seek correctly. Fortunately, this duration is given to
you when you call <code class="methodname">BMediaRoster::SetRefFor()</code> on a file interface node.
</p></li><li><p>
How do we know when the node has finished handling its Seek
operation? One way would be to note the performance time that the Seek
is supposed to happen at, and wait until we reach that performance
time. But doing this is harder than it may at first appear, because
you'll need to know the latency of the node to figure out when it will
reach the desired performance time—and the latency may change at any
time!
</p></li></ul><p>
The failsafe way to do this is to have the node inform us when it's
reached a certain performance time. In R4.5, we let you do just
that, with <code class="methodname">BMediaRoster::SyncToNode()</code>. This method simply blocks until
the node has reached the desired performance time.
</p><p>
Some of you node writers may be wondering how this works. Well, when the
application calls <code class="methodname">SyncToNode()</code>, the new method
<code class="methodname">BMediaNode::AddTimer()</code> gets
called on the node side, with the target performance time and a cookie
that the node must remember. When the performance time is reached, the
node calls <code class="methodname">BMediaNode::TimerExpired()</code> with the
cookie. <code class="methodname">TimerExpired()</code> then
sends a notification back to the roster. (If your node derives from
<code class="classname">BMediaEventLooper</code>, relax: we take care of all of
this for you.)
</p><p>
<code class="classname">NodeLoopManager</code> is a class that manages looping play on the application
side of the fence. You create it and tell it the starting media time and
loop duration. Then when you call <code class="methodname">NodeLoopManager::Start()</code>, it sends Seek
commands periodically to the node, each time waiting until the previous
<code class="methodname">Seek()</code> is handled before sending the next one.
<code class="classname">NodeLoopManager</code> spawns a
thread to do the waiting and looping, so that it doesn't interfere with
your application's operation. When you want to stop the node, use
<code class="methodname">NodeLoopManager::Stop()</code> so that the node stops looping as well.
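</p><p>
Purely as an illustration of that pattern (the real class is in the archive below, and the member names here are invented), the body of that looping thread boils down to something like this:
</p><pre class="programlisting cpp">
#include &lt;MediaRoster.h&gt;

// Sketch only: error handling and shutdown details omitted.
// Static member function used as the thread entry point.
int32 NodeLoopManager::LoopThread(void* data)
{
    NodeLoopManager* mgr = (NodeLoopManager*)data;
    BMediaRoster* roster = BMediaRoster::Roster();
    bigtime_t when = mgr-&gt;fStartTime;

    while (!mgr-&gt;fStopping) {
        // The node reaches the end of the current pass here...
        when += mgr-&gt;fLoopDuration;
        // ...so queue a Seek back to the starting media time for then,
        roster-&gt;SeekNode(mgr-&gt;fNode, mgr-&gt;fMediaStart, when);
        // ...and block until the node has actually gotten there before
        // queueing the next one (at most one pending Seek at a time).
        roster-&gt;SyncToNode(mgr-&gt;fNode, when);
    }
    return 0;
}
</pre><p>
Spawned with <code class="function">spawn_thread()</code> at <code class="methodname">Start()</code> time, a loop like this never asks the node to hold more than one pending Seek.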
</p></div><p>
You can see all three of these classes in action by perusing the
new-and-improved Mix-A-Lot:
</p><p>
&lt;ftp://ftp.be.com/pub/samples/media_kit/MixALot.zip&gt;
</p></div><hr class="pagebreak" /><div class="sect1"><div xmlns="" xmlns:d="http://docbook.org/ns/docbook" class="titlepage"><div><div xmlns:d="http://docbook.org/ns/docbook"><h2 xmlns="http://www.w3.org/1999/xhtml" class="title"><a id="Gassee4-25"></a>No Article This Week</h2></div><div xmlns:d="http://docbook.org/ns/docbook"><span xmlns="http://www.w3.org/1999/xhtml" class="author">By <span class="firstname">Jean-Louis</span> <span class="surname">Gassée</span></span></div></div></div><p>
Jean-Louis is traveling with a crew disguised as an aging rock band,
complete with A/V equipment cases as props, so he did not write an
article this week.
</p></div></div><div id="footer"><hr /><div id="footerT">Prev: <a href="Issue4-24.html">Issue 4-24, June 16, 1999</a>  Up: <a href="volume4.html">Volume 4: 1999</a>  Next: <a href="Issue4-26.html">Issue 4-26, June 30, 1999</a> </div><div id="footerB"><div id="footerBL"><a href="Issue4-24.html" title="Issue 4-24, June 16, 1999"><img src="./images/navigation/prev.png" alt="Prev" /></a> <a href="volume4.html" title="Volume 4: 1999"><img src="./images/navigation/up.png" alt="Up" /></a> <a href="Issue4-26.html" title="Issue 4-26, June 30, 1999"><img src="./images/navigation/next.png" alt="Next" /></a></div><div id="footerBR"><div><a href="http://www.haiku-os.org"><img src="./images/People_24.png" alt="haiku-os.org" title="Visit The Haiku Website" /></a></div><div class="navighome" title="Home"><a accesskey="h" href="index.html"><img src="./images/navigation/home.png" alt="Home" /></a></div></div><div id="footerBC"><a href="http://www.access-company.com/home.html" title="ACCESS Co."><img alt="Access Company" src="./images/access_logo.png" /></a></div></div></div><div id="licenseFooter"><div id="licenseFooterBL"><a rel="license" href="http://creativecommons.org/licenses/by-nc-nd/3.0/" title="Creative Commons License"><img alt="Creative Commons License" style="border-width:0" src="https://licensebuttons.net/l/by-nc-nd/3.0/88x31.png" /></a></div><div id="licenseFooterBR"><a href="./LegalNotice.html">Legal Notice</a></div><div id="licenseFooterBC"><span id="licenseText">This work is licensed under a
<a rel="license" href="http://creativecommons.org/licenses/by-nc-nd/3.0/">Creative
Commons Attribution-Non commercial-No Derivative Works 3.0 License</a>.</span></div></div></body></html>