add infinite_dbas post

This commit is contained in:
husky 2023-08-27 14:36:28 -07:00
parent c74892cda8
commit d8fa817589
No known key found for this signature in database
GPG key ID: 6B3D8CB511646891
4 changed files with 131 additions and 0 deletions

View file

@ -15,6 +15,7 @@
</h3>
<ul class="blog-listing">
<li><a href="posts/infinite_dbas/index.html">The Simple but Awful Result of Making a Miscalculation: Week 3 of VoreStation Development (2023-08-27)</a></li>
<li><a href="posts/journaling_impl/index.html">VapUFS Implementation Hell: Week 2 of VoreStation Development (2023-08-13)</a></li>
<li><a href="posts/filesystems_and_apm/index.html">Filesystems and APM: Week 1 of VoreStation Development (2023-07-30)</a></li>
</ul>

Binary file not shown.

After

Width:  |  Height:  |  Size: 4 KiB

View file

@ -0,0 +1,130 @@
<html lang="en">
<head>
<title>vore microcomputers development blog - The Simple but Awful Result of Making a Miscalculation: Week 3 of VoreStation Development</title>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8"/>
<link rel="stylesheet" href="/microsoft.css"/>
<link rel="icon" type="image/x-icon" href="/favicon.ico">
</head>
<body>
<div class="container">
<div class="container-item">
<a href="/"><img src="/assets/logo_typeset.svg" alt="vore microcomputers logo"/></a>
<h1>The Simple but Awful Result of Making a Miscalculation: Week 3 of VoreStation Development</h1>
<h5>2023-08-27</h5>
<h4>written by Niko Chow-Stuart (Lead Developer)</h4>
<img src="/blog/posts/infinite_dbas/infinite_dbas.png" alt="drawing of the vap mascot attempting to lift a heavy briefcase, with papers falling out. the briefcase is labeled 0.7 KiloBytes."/>
<p>
quick note before i start: as mentioned on our social media pages, we've decided to change the (official)
rate of these blog posts to one every two weeks. this is mainly because i feel that i'm not able to
accomplish enough work within a single week to warrant a blog post, in addition to the fact that we've
already been missing the weekly deadline for the past two weeks.
<br/>
anyways, onto the actual post!
<br/>
<br/>
almost the instant after posting the previous blog post, i posted a follow-up on my personal social media
pages, stating that i had made a miscalculation in the design of the filesystem. unfortunately, this
miscalculation was a rather large one, and as such, i've had to spend the past two weeks reworking the
parts of the filesystem that were affected by it.
<br/>
<br/>
as you may have already seen, the miscalculation in question was regarding the design of direct and indirect
block pointers in the inodes. originally, i had decided on a design of either 12 direct blocks, or 9 direct
blocks and 3 indirect blocks. however, after a very short amount of running the numbers through a
calculator (which should've been done way earlier), i realised that this design would (assuming a relatively
average block size of 512 bytes) limit the maximum file size to less than a megabyte. this is obviously
undesirable, and as such, i decided to rework things to be a bit more future-proof.
<br/>
<br/>
the new design is a bit more complex, in that it essentially allows for an unlimited number of direct blocks
and indirect blocks. the way that this works is that each indirect block can specify (for each pointer)
whether the pointer is an indirect block or a direct block. this means that the maximum file size is now
limited by the maximum number of indirect datablocks, which theroetically could be up to around 2^64 blocks,
although reaching this limit would require an incredibly large amount of storage space.
<br/>
<br/>
one potential issue with this design is that it requires a lot of work to index the blocks of a file. say
you wanted to access the second block of a file, you would have to read the first indirect block, then
depending on the pointer type of the first pointer listed, either skip it (if its a datablock), or read
that datablock just to figure out if the second pointer is stored in that datablock or the next one. this
also would mean that you'd need to essentially do this forever until you reach the datablock you want,
which is a big performance hit. however, i've decided to add one extra feature to the pointer listing to
account for this.
<br/>
<br/>
each pointer is accompanied by a u8 value, which is either 0 or 1. depending on that value, the pointer
either points to a datablock or an indirect block. importantly, immediately after that u8 comes a 56-bit
integer (thus in total, flags + u56 is a u64), which contains absolute number of datablocks that this indirect block points to (undefined if its
not an indirect pointer). this means that, if you want to access the second block of a file, you can
simply read the first indirect block, and then skip the first pointer if you know that the second block is
not stored in that indirect block (i.e. if the u56 value is less than or equal to 1). this should hopefully
drastically improve performance, while keeping the design relatively simple. (either that or i'll regret my
confidence in however long until i can test it).
<br/>
<br/>
in an ideal implementation, blocks are layed out by attempting to create a tree of
indirect blocks that is as balanced as possible. this means that, if you have a file that is 5 blocks long
(assuming a block size that can contain only two blocks for the purpose of this example), the blocks will
be layed out like this:
<br/>
<img src="/blog/posts/infinite_dbas/graph.png" alt="a graph showing the layout of blocks. one block is at the top, with two blocks below it. data blocks 1 and 2 are below the first block, data blocks 3 and 4 are below the second block. data block 5 is attached directly to the end of the root block"/>
<br/>
please note my uncertainty when i say this, but i'm <i>pretty sure</i> that this is how the implementation
i've written works. in practice, it may actually add a useless indirect block in between the root and datablock 5,
but i'm not sure. i'll have to look over the code again (and probably rewrite it to be better).
<br/>
<br/>
overall, this is pretty much all the work i've done over the past two weeks. trying to get the idea of this
into my head was somewhat tricky at first, and as such that took me quite a bit of the implementation time.
notably, a big issue with this implementation is that files can no longer easily be non-entirely updated via
a journaled multiblock write. this is because of the complexity of this structure, and i may end up either
changing the way the multiblock write works or adding another type of write to account for this.
<br/>
<br/>
all changes to the source code should be public by the time this is up, as always you can view it at
<a href="https://git.voremicrocomputers.com/voremicrocomputers/vap/vapfs/vapfs_ufs">our git</a>.
<br/>
<br/>
that should be it for this week's blog post! i'm hoping that i'll be able to get back on track with the
development of the filesystem, and hopefully by next time i'll have gotten the B-Tree implementation for
directories working! (i'm not sure if i'll actually be able to get it working by then, but i'll try my best!)
<br/>
<br/>
donations to our libera pay are of course appreciated as always (:
<br/>
<script src="https://liberapay.com/voremicrocomputers/widgets/button.js"></script>
<noscript><a href="https://liberapay.com/voremicrocomputers/donate"><img alt="Donate using Liberapay" src="https://liberapay.com/assets/widgets/donate.svg"></a></noscript>
<br/>
besides that, i'll see you all in two weeks!
</p>
<hr/>
<p id="footer">
<br>contact us at <a href="mailto:nikocs@voremicrocomputers.com">nikocs@voremicrocomputers.com</a> <a
href="/pgp.asc">pgp key</a>
<br>
for abuse, copyright, or other legal issues, contact <a href="mailto:devnull@voremicrocomputers.com">devnull@voremicrocomputers.com</a><br>
<!-- or, call at <a href="tel:1-888-519-0437">+1 (888) 519-0437 (toll-free)</a> or <a href="tel:1-507-767-9433">+1 (507) RMS-YIFF (local)</a> -->
(our phone system is currently down, sorry!)
<br><br>
Vore Microcomputers is an unregistered trademark of Real Microsoft, LLC. all references to "the company"
are in reference to Real Microsoft, LLC and the name Vore Microcomputers is only a reference to the
hardware/software focused development project held under Real Microsoft, LLC. Real Microsoft, LLC is
in no way associated with the Microsoft Corporation or its products/projects.
<br><br>
<a href="https://climate.stripe.com/4MO1d9">Our Carbon Footprint</a>
</p>
</div>
<div class="container-item">
<div class="credit">
<img src="/xenia.png" alt="image of xenia, an anthropomorphic fox who was a contender for the linux mascot"/>
<span>image made by <a href="https://twitter.com/cathodegaytube/">@cathodegaytube</a>!</span>
</div>
</div>
</div>
</body>
</html>

Binary file not shown.

After

Width:  |  Height:  |  Size: 9.3 KiB