Data Processing and Searching Solutions

 

Our core business specialty is building Data Processing and Searching Solutions.

Over the years we have built a variety of document processing applications. We retrieve documents from many sources, extract and transform their content, generate derived content, and index everything so it can be retrieved later through complex search mechanisms.
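The retrieve-extract-index pipeline above can be illustrated with a minimal inverted index: tokenize each document's content into a term-to-documents map, then answer keyword queries by intersecting posting sets. This is only a sketch of the idea; the class and method names (`TinyIndex`, `index`, `search`) are illustrative and not part of the actual product, which uses Solr for this.

```java
import java.util.*;

public class TinyIndex {
    // term -> set of ids of documents containing that term
    private final Map<String, Set<String>> postings = new HashMap<>();

    // Extract terms from a document's text and record where they occur.
    public void index(String docId, String content) {
        for (String token : content.toLowerCase().split("\\W+")) {
            if (!token.isEmpty()) {
                postings.computeIfAbsent(token, k -> new TreeSet<>()).add(docId);
            }
        }
    }

    // Return the ids of all documents containing every query term.
    public Set<String> search(String query) {
        Set<String> result = null;
        for (String token : query.toLowerCase().split("\\W+")) {
            if (token.isEmpty()) continue;
            Set<String> docs = postings.getOrDefault(token, Collections.emptySet());
            if (result == null) {
                result = new TreeSet<>(docs);
            } else {
                result.retainAll(docs);
            }
        }
        return result == null ? Collections.emptySet() : result;
    }

    public static void main(String[] args) {
        TinyIndex idx = new TinyIndex();
        idx.index("doc1", "Quarterly backup report for mail accounts");
        idx.index("doc2", "Backup of desktop documents");
        System.out.println(idx.search("backup mail")); // [doc1]
    }
}
```

A production engine adds ranking, incremental updates and distribution on top of exactly this data structure.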

 

Document backup and search solution

Overview

A high-volume client-server application for backing up and analyzing documents.

Problem

Most backup/sync solutions are simplistic: they do not let users view, search or categorize the backed-up content. Our application backs up desktop computers, mail accounts, GDocs accounts, Twitter and Facebook, then analyzes and indexes all the data so the user can search and categorize it. It can restore file versions from any chosen date, and download or e-mail documents in a converted format such as PDF.

Challenge

Implement a scalable architecture that can process thousands of files per second; handle billions of files and store them in a safe, secure and redundant way; extract data from billions of files, index that data and query it; and track versions for billions of files. The architecture must scale across any number of machines. We also implemented a native client application that runs on multiple OSes, monitors files on disk, backs them up and, when a file is versioned, sends only the file differences to save user bandwidth.
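Sending only file differences relies on the rsync idea of a weak rolling checksum: the checksum of a byte window can be slid forward one byte in O(1) instead of re-reading the whole block, so the client can cheaply find blocks the server already has. The sketch below shows the classic rsync-style checksum in Java; the production client is a customized C++ rsync implementation, so the details there differ.

```java
public class RollingChecksum {
    private static final int MOD = 1 << 16;
    private int a, b, len;

    // Checksum an initial block of n bytes starting at off.
    public void reset(byte[] block, int off, int n) {
        a = 0; b = 0; len = n;
        for (int i = 0; i < n; i++) {
            int v = block[off + i] & 0xFF;
            a = (a + v) % MOD;             // plain byte sum
            b = (b + (n - i) * v) % MOD;   // position-weighted sum
        }
    }

    // Slide the window one byte: drop `out`, append `in`, in O(1).
    public void roll(byte out, byte in) {
        int o = out & 0xFF, i = in & 0xFF;
        a = Math.floorMod(a - o + i, MOD);
        b = Math.floorMod(b - len * o + a, MOD);
    }

    public int digest() {
        return (b << 16) | a;
    }

    public static void main(String[] args) {
        byte[] data = "hello world".getBytes();
        RollingChecksum rolled = new RollingChecksum();
        rolled.reset(data, 0, 4);       // checksum of "hell"
        rolled.roll(data[0], data[4]);  // now covers "ello"
        RollingChecksum fresh = new RollingChecksum();
        fresh.reset(data, 1, 4);        // "ello" recomputed from scratch
        System.out.println(rolled.digest() == fresh.digest()); // true
    }
}
```

Blocks whose weak checksum matches one the server already holds are then confirmed with a strong hash and skipped, so only genuinely changed data crosses the wire.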

Technical details

  • Stateless client-server architecture.
  • Server side in Java: Wicket for the interface, Solr as the indexing engine, Hadoop for storage, Postgres as the database, with load balancers in front. All components have isolated functionality, which lets us add more processing instances as needed.
  • High processing capability through optimized parallel processing: our test system has 6 instances and can process ~1,000 documents per second. Highly redundant.
  • The Solr indexing engine sustains ~100,000 inserts/updates per second without locking, thanks to a specialized high-capacity RAM index buffer.
  • Hosted in a ProfitBricks datacenter.
  • Client side: a C++ application built with Qt for Mac and Windows that backs up files, monitors changes and restores file versions at the date chosen by the user. File transfer is optimized to send only differences (a customized rsync implementation).
  • We also implemented a scalable C++ file transport server that can handle thousands of clients.
  • File storage is a custom NFS-based distributed disk storage. After launch we will start implementing a higher-volume solution, such as a Hadoop or JackRabbit implementation.
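The parallel-processing model behind the throughput figures above can be sketched with a plain `java.util.concurrent` worker pool: documents are submitted to a fixed pool of independent workers, so throughput grows by adding workers (or whole instances). This is a simplified illustration, not the product's code; the names (`processDocument`, `WORKERS`) and the trivial processing step are ours.

```java
import java.util.*;
import java.util.concurrent.*;

public class DocumentPipeline {
    static final int WORKERS = 6; // mirrors the 6-instance test system

    // Stand-in for the real extract/convert/index work done per document.
    static String processDocument(String doc) {
        return doc.toUpperCase();
    }

    // Process all documents in parallel, preserving input order in the result.
    public static List<String> processAll(List<String> docs) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(WORKERS);
        try {
            List<Future<String>> futures = new ArrayList<>();
            for (String doc : docs) {
                futures.add(pool.submit(() -> processDocument(doc)));
            }
            List<String> results = new ArrayList<>();
            for (Future<String> f : futures) {
                results.add(f.get()); // waits for each worker to finish
            }
            return results;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(processAll(Arrays.asList("a", "b", "c"))); // [A, B, C]
    }
}
```

Because each document is processed independently, the same pattern scales from threads in one JVM to queues feeding multiple machines.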

Things we did

  • Software Architecture
  • Coding
  • User Interface Design
 
 
 