Is it wrong to write the bytes of images into the database?

Viewed 11,287 times

66

When should I store images directly in the database?

In what situations?

I know I can store just the image path in the database.

  • 3

    MS SQL Server offers a solution that has the best of both worlds (FILESTREAM): http://msdn.microsoft.com/en-us/library/gg471497.aspx. Oh, if only all databases had such a feature! (preferably in a standardised form).

7 answers

67


Besides the higher cost already mentioned, several factors should be taken into account:

  • Data volume: for a low volume of data there may be no problem. On the other hand, for very large volumes of data, storing it in the database is practically unviable. Ask Facebook or Google whether they would use a database. Facebook, for example, uses a custom file system to make access even faster and to reduce the per-file overhead of traditional file systems.
  • Clustering: one advantage of the database is that if your system runs on multiple servers, all of them will have uniform access to the files. However, you can also use a network drive to store the files.
  • Availability: will your system get a lot of traffic? That may overload a traditional database. On the other hand, your HTTP server can use low-level file system routines to stream the data to the client.
  • Scalability: if demand for volume or availability increases, is it possible to add more capacity to the system? It is much easier to split files across different servers than to distribute the records of one table across several servers.
  • Flexibility: making backups, moving files from one server to another, or running some processing over the stored files is all easier if the files are in a directory. If you deploy in a client's environment, files stored in the database may make it impossible for you to receive copies of the database for testing. Try asking your client to send you terabytes of data to analyze when there is a problem in the database.
  • Read and write overhead: the computational cost of reading and writing data through the database is greater than that of reading and writing a file directly.

There are several strategies to scale a system in terms of both availability and volume. Basically, these strategies consist of distributing the files across multiple servers and redirecting the user to one of them according to some criterion. Details vary by implementation: data update strategy, redundancy, distribution criteria, etc.
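As a rough illustration of "redirecting the user according to some criterion", here is a minimal sketch that picks a server from a stable hash of the file key. The server names, the `/images/` URL layout, and the function names are all hypothetical; a real deployment would also need redundancy and a plan for what happens when the server list changes.

```python
import hashlib

# Hypothetical server names, used only for illustration.
SERVERS = ["img1.example.com", "img2.example.com", "img3.example.com"]


def pick_server(file_key: str, servers: list[str] = SERVERS) -> str:
    """Map a file key to one server using a stable hash of the key."""
    digest = hashlib.sha256(file_key.encode("utf-8")).digest()
    # Use the first 8 bytes of the digest as an integer, then wrap around
    # the number of servers.
    index = int.from_bytes(digest[:8], "big") % len(servers)
    return servers[index]


def image_url(file_key: str) -> str:
    """Build the URL the client would be redirected to (assumed layout)."""
    return f"https://{pick_server(file_key)}/images/{file_key}"
```

The same key always maps to the same server, so no lookup table is needed. Note that a plain modulo remaps most keys whenever a server is added or removed, which is one reason the distribution criterion is an implementation detail worth thinking about.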

One of the great difficulties in managing files outside the database is that we now have two distinct data sources that must always be kept in sync.

From a security point of view, there is effectively little difference. If an attacker can compromise a server, they will be able to read both your system's files on disk and the database system's files. If this issue is critical, an alternative is to store the data encrypted.

Still, whenever I have analyzed which type of solution is best, the file system has always come out well ahead.

  • 9

    +1 This answer summed it up very well... and added a few points. I left this comment because I know that organizing a response like this is a laborious thing, and just to give an upvote wasn’t enough for me. Congratulations! = D

24

It’s not wrong... most developers avoid doing this because the cost of database space is usually much higher than the cost of file storage.

So ideally you use file storage for large volumes of data, and the relational database for structured data.

But suppose cost is not a concern, and that you will always use a local database... in that case you can store the images in the database and it won’t make a difference. Even so, I think it would be easier to save them as files, using the very language you are already working in. That is usually a much simpler operation than saving to the database.

With this strategy of saving to file storage and referencing the file by its path, you will have to manage integrity manually, which can also be quite difficult:

  • delete the files when the associated record ceases to exist

  • ensure that the file is not deleted as long as there is a record pointing to it

  • ensure the atomicity of the operation of creating a record along with the file

  • I am also in favor of storing only the file name in the database. It also makes the photos easier to maintain, since it is much easier to view them in the operating system's file explorer than through the database.
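The integrity points listed above can be sketched roughly as follows, using SQLite and a local directory. The `photos` table, its `file_name` column, and the function names are assumptions made for this example. It narrows the window for inconsistency but does not make the file and the record truly atomic: a crash between the commit and the file deletion can still leave an orphan file behind.

```python
import os
import sqlite3
import uuid
from pathlib import Path


def init_db(conn: sqlite3.Connection) -> None:
    # Hypothetical schema: one row per image file.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS photos ("
        "id INTEGER PRIMARY KEY, file_name TEXT NOT NULL UNIQUE)"
    )


def add_photo(conn: sqlite3.Connection, image_dir: Path, data: bytes,
              suffix: str = ".jpg") -> int:
    """Write the file and insert the row; undo the file if anything fails."""
    file_name = f"{uuid.uuid4().hex}{suffix}"
    tmp_path = image_dir / (file_name + ".tmp")
    final_path = image_dir / file_name
    tmp_path.write_bytes(data)
    try:
        with conn:  # commits on success, rolls back on an exception
            cur = conn.execute(
                "INSERT INTO photos (file_name) VALUES (?)", (file_name,)
            )
            # Move the file into place only once the insert has succeeded.
            os.replace(tmp_path, final_path)
        return cur.lastrowid
    except Exception:
        # The row was rolled back, so remove whatever reached the disk.
        for path in (tmp_path, final_path):
            path.unlink(missing_ok=True)
        raise


def delete_photo(conn: sqlite3.Connection, image_dir: Path, photo_id: int) -> None:
    """Delete the row first, then the file, so no row points to a missing file."""
    with conn:
        row = conn.execute(
            "SELECT file_name FROM photos WHERE id = ?", (photo_id,)
        ).fetchone()
        if row is None:
            return
        conn.execute("DELETE FROM photos WHERE id = ?", (photo_id,))
    (image_dir / row[0]).unlink(missing_ok=True)
```

Deleting the record before the file means the worst case is a leftover file, which is cheaper to clean up later than a record pointing at nothing.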

17

You can write the image bytes directly into the database when your concerns do not include:

  • Database space (due to its high cost)
  • Access speed

However, saving only the image path in the database can cause certain difficulties in managing backups, restores and access permissions.
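To make the two options concrete, here is a small sketch using SQLite that stores the same image either as a BLOB column or as a file on disk with only its path in the table. The table names, column names, and functions are hypothetical and chosen just for this comparison.

```python
import sqlite3
from pathlib import Path


def create_tables(conn: sqlite3.Connection) -> None:
    # Option 1: the image bytes live inside the database.
    conn.execute(
        "CREATE TABLE images_blob (id INTEGER PRIMARY KEY, name TEXT, data BLOB)"
    )
    # Option 2: the database keeps only the path; the bytes live on disk.
    conn.execute(
        "CREATE TABLE images_path (id INTEGER PRIMARY KEY, name TEXT, path TEXT)"
    )


def save_as_blob(conn: sqlite3.Connection, name: str, data: bytes) -> int:
    with conn:
        cur = conn.execute(
            "INSERT INTO images_blob (name, data) VALUES (?, ?)", (name, data)
        )
    return cur.lastrowid


def load_blob(conn: sqlite3.Connection, image_id: int) -> bytes:
    row = conn.execute(
        "SELECT data FROM images_blob WHERE id = ?", (image_id,)
    ).fetchone()
    return row[0]


def save_as_path(conn: sqlite3.Connection, image_dir: Path, name: str,
                 data: bytes) -> int:
    path = image_dir / name
    path.write_bytes(data)
    with conn:
        cur = conn.execute(
            "INSERT INTO images_path (name, path) VALUES (?, ?)", (name, str(path))
        )
    return cur.lastrowid


def load_from_path(conn: sqlite3.Connection, image_id: int) -> bytes:
    row = conn.execute(
        "SELECT path FROM images_path WHERE id = ?", (image_id,)
    ).fetchone()
    return Path(row[0]).read_bytes()
```

With the BLOB option a database backup contains everything. With the path option the backup, restore and permission handling of the image directory has to be arranged separately.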

In short, the best practice depends on the characteristics of your application and your operational needs, taking into account the database used, the file system and the flow of data requests.

  • If you want to take a look, there is a Microsoft Research study comparing large object (BLOB) storage in SQL Server 2005 with storage as files in the NTFS file system.

14

In addition to what everyone has mentioned here, storing in the database is worth considering when, for security reasons, you need to guarantee the integrity of the data record together with its image/file, and to reduce the exposure to risk when accessing images in a file archiving system.

For example: when a database record can never exist without an associated image, such as the 3x4 photo for an identity document, or an X-ray and its report, etc.

That decision should be based mainly on an analysis of the non-functional requirements of your solution.
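One way to express "a record can never exist without its image" is a constraint on the table itself. The sketch below uses SQLite; the `xrays` table and its columns are hypothetical names for the X-ray example.

```python
import sqlite3


def create_schema(conn: sqlite3.Connection) -> None:
    # The image column is mandatory and must not be empty, so the database
    # itself rejects a report that has no image attached.
    conn.execute(
        "CREATE TABLE xrays ("
        "id INTEGER PRIMARY KEY, "
        "report TEXT NOT NULL, "
        "image BLOB NOT NULL CHECK (length(image) > 0))"
    )


def add_xray(conn: sqlite3.Connection, report: str, image: bytes) -> int:
    """Insert the report and its image in a single statement."""
    with conn:
        cur = conn.execute(
            "INSERT INTO xrays (report, image) VALUES (?, ?)", (report, image)
        )
    return cur.lastrowid
```

Because the report and the image are written by the same INSERT, they are committed or rejected together. Passing `None` or empty bytes as the image raises `sqlite3.IntegrityError`.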

7

You can store the images in the database without any problem. Here at work we have a 1.5 TB database, 95% of which is images. But you need to examine the reason for this decision. In my case, security is a priority. Study security, network, access volume and file management for your purpose. In my experience, if possible, just store the path as you said and keep the image on disk. Depending on the size of your database, backups and replication will be much harder to manage with the images inside it than with the images on disk.

4

I have already built a system that kept images in sync with their respective data in the database, saving the image's filename in a varchar field in the database and naming the image file with the record's ID so the two could be related.

It is perfectly possible to keep the image files in sync with the database, but this depends much more on the programmer's knowledge than on the technology applied. It is not something that comes ready-made, and to this day I have not heard of any off-the-shelf project for it. None of the image bank systems I know of use a database to store the images; they use file storage.
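Part of that synchronization work is periodically checking that disk and database still agree. A minimal sketch of such a check, assuming a hypothetical `photos` table with a `file_name` column and a single flat image directory:

```python
import sqlite3
from pathlib import Path


def find_mismatches(conn: sqlite3.Connection, image_dir: Path):
    """Return (orphan_files, dangling_records) as two sorted lists of names.

    orphan_files: files on disk that no record refers to.
    dangling_records: file names in the table with no file on disk.
    """
    on_disk = {p.name for p in image_dir.iterdir() if p.is_file()}
    in_db = {row[0] for row in conn.execute("SELECT file_name FROM photos")}
    return sorted(on_disk - in_db), sorted(in_db - on_disk)
```

What to do with the results (delete orphans, flag dangling records, restore from backup) is a decision for each system; the sketch only reports the differences.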

One thing that makes the work harder is building a proper image indexing mechanism, which matters if you need to classify and search the images efficiently and flexibly. In general, though, I think this part is handled by the database itself (a table can already serve as an index). I struggled a lot with this part and found that here in Brazil it is hard to find enough information to build this type of system. You even need to understand the underlying file system to create an index file, since there will be a limit on the file size...

-3

I am finishing the development of a news and image repository system and I ran into this same question. To give you an idea, we have more than 20 years of news with 1 TB of photos alone. Imagine having to restore a database backup of more than 1 TB. My suggestion: if it is a few images that do not exceed 100 MB in total, you can use the database to keep things simple. If it is larger than that, use the file system. If you are in the cloud, use Amazon S3 or the equivalent service in other clouds.
