<?xml version="1.0" encoding="utf-8"?>
    <feed xmlns="http://www.w3.org/2005/Atom">
     <title>BigBinary Blog</title>
     <link href="https://www.bigbinary.com/feed.xml" rel="self"/>
     <link href="https://www.bigbinary.com/"/>
     <updated>2026-03-08T07:14:44+00:00</updated>
     <id>https://www.bigbinary.com/</id>
     <entry>
       <title><![CDATA[Efficient uploading and persistent storage of NeetoRecord videos using AWS S3]]></title>
       <author><name>Unnikrishnan KP</name></author>
      <link href="https://www.bigbinary.com/blog/persistant-storage-for-recordings-in-s3-loom-alternative-part-2"/>
      <updated>2024-03-16T12:00:00+00:00</updated>
      <id>https://www.bigbinary.com/blog/persistant-storage-for-recordings-in-s3-loom-alternative-part-2</id>
<content type="html"><![CDATA[<p>This is part 2 of our blog on how we are building <a href="https://www.neeto.com/neetorecord">NeetoRecord</a>, a Loom alternative. Here are <a href="https://www.bigbinary.com/blog/build-web-based-screen-recorder-loom-alternative-part-1">part 1</a> and <a href="https://www.bigbinary.com/blog/mp4_transmuxing_and_streaming_support-loom-alternative-part-3">part 3</a>.</p>

<p>In the previous blog, we learned how to use the browser APIs to record the screen and generate a WEBM file. We now need to upload this file to persistent storage so that we have a URL to share our recording with our audience.</p>

<p>Uploading a large file all at once is time-consuming and prone to failure due to network errors. The recording is generated in parts, with each part pushed to an array and joined together. So it would be ideal if we could upload these smaller parts as and when they are generated, and then join them together in the backend once the recording is completed. AWS's <a href="https://aws.amazon.com/s3/">Simple Storage Service (S3)</a> was a perfect fit, as it provides cheap persistent storage along with a <a href="https://docs.aws.amazon.com/AmazonS3/latest/userguide/mpuoverview.html">Multipart Uploads</a> feature.</p>

<p>S3 Multipart Uploads allow us to upload large objects in parts. Rather than uploading the entire object in a single operation, a multipart upload breaks it down into smaller parts, each ranging from 5 MB to 5 GB. Once uploaded, these parts are aggregated to form the complete object.</p>

<h2>Initialization</h2>

<p>The process begins with an initiation request to S3, where a unique upload ID is generated. This upload ID is used to identify and manage the individual parts of the upload.</p>

<pre><code>s3 = Aws::S3::Client.new

resp = s3.create_multipart_upload({
  bucket: bucket_name,
  key: object_key
})

upload_id = resp.upload_id</code></pre>

<h2>Upload Parts</h2>

<p>Once the upload is initiated, we can upload the parts to S3 independently.
Each part is associated with a sequence number and an ETag (Entity Tag), a checksum of the part's data.</p>

<p>Note that the minimum content size for a part is 5 MB (there is no minimum size limit on the last part of your multipart upload). So we store the recording chunks in local storage until they add up to more than 5 MB. Once we have a part greater than 5 MB, we upload it to S3.</p>

<pre><code>part_number = 1
content = recordedChunks

resp = s3.upload_part({
  body: content,
  bucket: bucket_name,
  key: object_key,
  upload_id: upload_id,
  part_number: part_number
})

puts &quot;ETag for Part #{part_number}: #{resp.etag}&quot;</code></pre>

<h2>Completion</h2>

<p>Once all parts are uploaded, a complete multipart upload request is sent to S3, specifying the upload ID and the list of uploaded parts along with their ETags and sequence numbers. S3 then assembles the parts into a single object and finalizes the upload.</p>

<pre><code>completed_parts = [
  { part_number: 1, etag: 'etag_of_part_1' },
  { part_number: 2, etag: 'etag_of_part_2' },
  ...
  { part_number: N, etag: 'etag_of_part_N' }
]

resp = s3.complete_multipart_upload({
  bucket: bucket_name,
  key: object_key,
  upload_id: upload_id,
  multipart_upload: {
    parts: completed_parts
  }
})</code></pre>

<h2>Aborting and Cancelling</h2>

<p>At any point during the multipart upload process, you can abort or cancel the upload, which deletes any uploaded parts associated with the upload ID.</p>

<pre><code>s3.abort_multipart_upload({
  bucket: bucket_name,
  key: object_key,
  upload_id: upload_id
})</code></pre>

<p>The uploaded file will finally be available at <code>s3://bucket_name/object_key</code>.</p>

<p>S3 Multipart Uploads offer us several advantages:</p>

<h3>Fault tolerance</h3>

<p>We can resume uploads from where they left off in case of network failures or interruptions.
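</p>

<p>For example, on restart we can ask S3 which parts it has already received (via the SDK's <code>list_parts</code> API) and continue from the next part number. Here is a minimal sketch; the <code>next_part_number</code> helper is illustrative, not part of the AWS SDK:</p>

<pre><code># After an interruption, list the parts S3 already holds for this
# upload ID (s3, bucket_name, object_key and upload_id are the same
# as in the earlier snippets):
#
#   resp = s3.list_parts({
#     bucket: bucket_name,
#     key: object_key,
#     upload_id: upload_id
#   })
#   uploaded = resp.parts.map(&amp;:part_number)
#
# The resume bookkeeping itself is plain Ruby: pick up right after
# the highest part number S3 has acknowledged.
def next_part_number(uploaded_part_numbers)
  (uploaded_part_numbers.max || 0) + 1
end

next_part_number([1, 2, 3]) # resume by uploading part 4</code></pre>

<p>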
Also, uploading large objects in smaller parts reduces the likelihood of timeouts and connection failures, especially in high-latency or unreliable network environments.</p>

<h3>Upload speed optimization</h3>

<p>With multipart uploads, you can parallelize the process by uploading multiple parts concurrently, optimizing transfer speeds and reducing overall upload time.</p>]]></content>
    </entry>
     </feed>