Turn Stream into Byte Array

0

Hello!

I have a ZIP file stream of approximately 450 MB, and I need to convert it to a byte array. The usual way to do this is with MemoryStream (System.IO.MemoryStream). This is the code I used:

// Declared outside the using block so it is still in scope at the return.
byte[] dadosArquivo;

using (Stream receiveStream = response.GetResponseStream())
using (MemoryStream ms = new MemoryStream())
{
    receiveStream.CopyTo(ms);
    dadosArquivo = ms.ToArray();
}
return dadosArquivo;

The problem is that calling the CopyTo method throws an OutOfMemoryException. From the tests I ran, MemoryStream seems to hit a limit at a stream size of approximately 256 MB.

Some extra information:

  • I receive this Stream as the response to an HTTP request (HttpWebResponse);
  • I use MemoryStream for this conversion because it was the only way I found in my research.
  • Regarding memory, I am using a machine with 6 GB of RAM. I ran the same test on another machine with 8 GB and the MemoryStream limit is the same.

Here is the stack trace of the error:

System.OutOfMemoryException was caught HResult=-2147024882
  Message=Exception of type 'System.OutOfMemoryException' was thrown.
  Source=mscorlib
  StackTrace:
       at System.IO.MemoryStream.set_Capacity(Int32 value)
       at System.IO.MemoryStream.EnsureCapacity(Int32 value)
       at System.IO.MemoryStream.Write(Byte[] buffer, Int32 offset, Int32 count)
       at System.IO.Stream.InternalCopyTo(Stream destination, Int32 bufferSize)
       at System.IO.Stream.CopyTo(Stream destination)
       at HiperPdvLibrary.Integracao.Api.ApiRequest.GetByteRequest(HttpStatusCode& status)
  InnerException: 

I would like to know whether anyone has already run into this situation or has another suggestion for doing this conversion, perhaps by processing the stream in parts?

Regards

  • Please post the relevant parts of the code; it is hard to tell what you are doing from 2 lines. Have you tested with more memory available? Have you checked memory usage at that moment? Have you tried copying in segments instead of copying everything at once? https://msdn.microsoft.com/en-us/library/dd783870(v=vs.110).aspx

  • What is the source of the stream? Is it a file on disk? What do you want to do with the file once it is in memory?

  • Please post the stack trace as well, to make the error easier to identify.

  • I put some more information that might clarify the problem.

  • Do you know in advance the size of the file you will receive over HTTP, or does it vary?

2 answers

1

The problem is this: MemoryStream starts by allocating a small buffer (for example, a 4-byte array). When the buffer fills up, MemoryStream creates a new buffer twice the size, copies the content into the new buffer, and discards the old one.

Pseudo-code:

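void AddByteToMemoryStream(MemoryStream ms, byte b)
{
    if (ms.Length == ms.Capacity)
    {
        // Allocate a buffer twice the size and copy the old content into it.
        var newBuffer = new byte[ms.Capacity * 2];
        Array.Copy(ms.Buffer, newBuffer, ms.Length);
        ms.Buffer = newBuffer; // the old buffer is discarded
    }

    ms.Add(b);
}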

Therefore, when the buffer reaches 256 MB and we try to write one more byte, a new 512 MB buffer is created. This means that, at that moment, at least 768 MB must be available, because the old and new buffers coexist while the content is copied. That, in principle, is not a problem.

More importantly, though, the 512 MB must be contiguous memory! The memory space is likely fragmented, and so the allocation fails.
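The arithmetic above can be checked with a small sketch. It assumes a buffer that simply doubles from a given starting capacity; GrowthSketch and its methods are hypothetical names, and the real MemoryStream growth policy may differ in its details.

```csharp
using System;

public static class GrowthSketch
{
    // Returns the capacity reached by doubling 'initialCapacity'
    // until it can hold 'targetBytes'.
    public static long FinalCapacity(long initialCapacity, long targetBytes)
    {
        if (initialCapacity <= 0)
            throw new ArgumentOutOfRangeException("initialCapacity");

        long capacity = initialCapacity;
        while (capacity < targetBytes)
            capacity *= 2;
        return capacity;
    }

    // Memory needed during the last resize: the old buffer (half the final
    // size) and the new buffer are both alive while the content is copied.
    public static long PeakBytesDuringLastResize(long initialCapacity, long targetBytes)
    {
        long finalCapacity = FinalCapacity(initialCapacity, targetBytes);
        if (finalCapacity == initialCapacity)
            return finalCapacity; // no resize was needed
        return finalCapacity + finalCapacity / 2;
    }
}
```

With an assumed starting capacity of 256 bytes and a target of 450,000,000 bytes, FinalCapacity returns 536,870,912 (512 MB) and PeakBytesDuringLastResize returns 805,306,368 (768 MB).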

There is no simple way to solve the problem, but I suggest these solutions:

Preallocate memory

If you know from the start that you will receive a 450MB file, try allocating 460MB at the start, to avoid unnecessary allocations.

var ms = new MemoryStream(460000000);
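If the server sends a Content-Length header, the size is known up front and the bytes can be read straight into an array of exactly that size, which also avoids the extra copy that ToArray() makes. A minimal sketch, assuming the length is known and fits in an int; ExactSizeReader and ReadExactly are hypothetical names:

```csharp
using System;
using System.IO;

public static class ExactSizeReader
{
    // Reads exactly 'length' bytes from 'input' into one array allocated once.
    public static byte[] ReadExactly(Stream input, long length)
    {
        if (length < 0 || length > int.MaxValue)
            throw new ArgumentOutOfRangeException("length");

        byte[] result = new byte[length];
        int offset = 0;
        while (offset < result.Length)
        {
            // Read may return fewer bytes than requested, so loop until full.
            int read = input.Read(result, offset, result.Length - offset);
            if (read == 0)
                throw new EndOfStreamException("Stream ended before the expected length.");
            offset += read;
        }
        return result;
    }
}
```

With the code from the question this could be called as ExactSizeReader.ReadExactly(receiveStream, response.ContentLength). ContentLength is -1 when the header is absent, in which case this sketch throws. It still needs one contiguous block of about 450 MB, so it reduces the pressure but does not remove the fragmentation risk.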

Stream chunks to the destination

The best solution, in my opinion, is to avoid having the entire file in memory. If the goal is to receive a file over HTTP and then save it to disk, you can stream directly:

/// <summary>
/// Copies the contents of input to output. Doesn't close either stream.
/// </summary>
public static void CopyStream(Stream input, Stream output)
{
    byte[] buffer = new byte[8 * 1024];
    int len;
    while ( (len = input.Read(buffer, 0, buffer.Length)) > 0)
    {
        output.Write(buffer, 0, len);
    }    
}

using (Stream file = File.Create(filename))
{
    CopyStream(receiveStream, file);
}

(Code by Jon Skeet)

This code copies blocks of 8 KB at a time to disk as they arrive over HTTP, so the program's memory usage never grows by much.
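Stream.CopyTo already copies in chunks; the exception in the question comes from the MemoryStream destination growing, not from CopyTo itself. The same streaming can therefore be sketched with the built-in method and a file as the destination. DiskSpooler and SaveToTempFile are hypothetical names, and the 80 KB buffer size is an arbitrary choice:

```csharp
using System.IO;

public static class DiskSpooler
{
    // Copies 'input' to a new temporary file in 80 KB chunks and returns its path.
    // The caller is responsible for deleting the file afterwards.
    public static string SaveToTempFile(Stream input)
    {
        string path = Path.GetTempFileName();
        using (FileStream file = File.Create(path))
        {
            input.CopyTo(file, 80 * 1024);
        }
        return path;
    }
}
```

With the code from the question, the call would be DiskSpooler.SaveToTempFile(receiveStream), after which the ZIP file can be processed from disk.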

1

Hello!

I was able to find a solution to my problem. It is a re-implementation of the MemoryStream class, called MemoryTributary, which allocates memory differently from MemoryStream. The link below has implementation details and a comparison with MemoryStream.

MemoryTributary on CodeProject

  • This solution spreads the allocations across the heap instead of allocating a single block of memory. It is a quick fix for the problem, but it is highly recommended that you stream chunks to the destination and avoid keeping the entire file in memory (if possible). This avoids problems that may arise when you receive even bigger files, have to load multiple files in parallel, or the user’s system has little memory available, etc.
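The idea of spreading allocations over many small blocks, rather than one contiguous array, can be illustrated with a minimal sketch. This is not the MemoryTributary source; ChunkedBuffer, its members, and the 64 KB block size are all hypothetical choices made for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

// Hypothetical illustration only; this is NOT the MemoryTributary implementation.
public sealed class ChunkedBuffer
{
    private const int BlockSize = 64 * 1024; // each block is a small, separate allocation
    private readonly List<byte[]> blocks = new List<byte[]>();
    private long length;

    public long Length { get { return length; } }

    // Appends 'count' bytes, allocating new blocks on demand instead of
    // growing (and copying) one large contiguous array.
    public void Write(byte[] source, int offset, int count)
    {
        while (count > 0)
        {
            if (length == (long)blocks.Count * BlockSize)
                blocks.Add(new byte[BlockSize]); // all existing blocks are full

            int blockOffset = (int)(length % BlockSize);
            int toCopy = Math.Min(count, BlockSize - blockOffset);
            Buffer.BlockCopy(source, offset, blocks[blocks.Count - 1], blockOffset, toCopy);
            offset += toCopy;
            count -= toCopy;
            length += toCopy;
        }
    }

    // Fills the buffer from a stream, one block-sized read at a time.
    public void ReadFrom(Stream input)
    {
        byte[] temp = new byte[BlockSize];
        int read;
        while ((read = input.Read(temp, 0, temp.Length)) > 0)
            Write(temp, 0, read);
    }

    // Writes the stored bytes to 'output' block by block.
    public void CopyTo(Stream output)
    {
        long remaining = length;
        foreach (byte[] block in blocks)
        {
            int toWrite = (int)Math.Min(remaining, BlockSize);
            output.Write(block, 0, toWrite);
            remaining -= toWrite;
        }
    }
}
```

No allocation here is larger than one block, so no large contiguous region is needed. The trade-off is that the data is never available as a single byte[], so the consumer has to work block by block or through a stream.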
