Project Brainstorm
Machine Vision
Dylan Stow
- Problem: Machine vision - determination of objects and distances from 2D pixel array images.
- Size: Machine vision is one of the most challenging issues facing robotics and AI. Without fast, accurate vision, robots have significant difficulty navigating without human interaction.
- How parallel?: Human vision is performed through massively parallel computation with a large number of sensor groupings and analysis pathways. Parallel computation for edge finding, shape determination, and object identification will be necessary for real-time machine vision.
- Difficulties: The algorithms behind human vision are not completely understood, so it is difficult to determine the best approach to implementing machine vision. Vision will also likely require a very large number of parallel computations to run quickly.
- How to measure improvement: Accuracy of object identification and speed of computation
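A minimal sketch of the edge-finding step mentioned above, assuming a grayscale image stored as a plain list of lists. Each output pixel depends only on its 3x3 neighborhood, so rows or tiles could be handed to independent parallel workers:

```python
# Sobel edge detection: gradient magnitude per pixel.
# Every output pixel is independent of every other, which is
# what makes this stage easy to parallelize.

def sobel_magnitude(img):
    h, w = len(img), len(img[0])
    gx_k = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]
    gy_k = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]
    out = [[0.0] * w for _ in range(h)]
    for y in range(1, h - 1):          # skip the border for simplicity
        for x in range(1, w - 1):
            gx = sum(gx_k[j][i] * img[y + j - 1][x + i - 1]
                     for j in range(3) for i in range(3))
            gy = sum(gy_k[j][i] * img[y + j - 1][x + i - 1]
                     for j in range(3) for i in range(3))
            out[y][x] = (gx * gx + gy * gy) ** 0.5
    return out

# A vertical step edge: strong response along the middle columns.
img = [[0, 0, 10, 10]] * 4
edges = sobel_magnitude(img)
```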
Procedural Game Content
Paula Ning
- Procedural generation of content in modern games
- As small or large as a designer wants it to be. Roguelikes procedurally generate dungeons tile by tile, which is not too hard, but you could also procedurally generate a world in an MMO, down to landscape feature shapes and textures, or procedurally generate the reactions of enemy units or the environment to certain kinds of player actions, especially with a natural input system like a Kinect, which allows for a wide variety of user inputs.
- Well, it would have to be done in parallel with graphics rendering, capturing user input, and perhaps even with itself, i.e. procedural generation of the environment in parallel with procedural generation of environmental reactions to user input.
- Figuring out how to condense an infinite number of possible scenarios into some finite number of nodes that can be combined to dynamically create game content
- I'm not sure procedural generation would be possible without some level of parallelization, since by definition new content must be generated simultaneously with gameplay. More parallelization means more aspects of the game experience can be procedurally generated.
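The tile-by-tile roguelike case above can be sketched with a seeded "drunkard's walk" that carves floor tiles out of solid wall. The map size, step count, and seed are arbitrary; in a larger game, different map chunks could be generated by parallel workers, each with its own seed:

```python
import random

def carve_dungeon(width, height, steps, seed):
    # Fixed seed -> the same map every time, which matters for
    # reproducible worlds and for splitting generation across workers.
    rng = random.Random(seed)
    grid = [['#'] * width for _ in range(height)]   # '#' = wall
    x, y = width // 2, height // 2
    for _ in range(steps):
        grid[y][x] = '.'                            # '.' = floor
        dx, dy = rng.choice([(1, 0), (-1, 0), (0, 1), (0, -1)])
        x = min(max(x + dx, 1), width - 2)          # stay inside the border
        y = min(max(y + dy, 1), height - 2)
    return grid

dungeon = carve_dungeon(20, 10, 60, seed=42)
```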
Route Finding
Cierra Owens
- Problem name/description
- Route Finding
- Finding an efficient way to travel from point A to point B
- Size of the problem
- People use route finding all the time
- Efficiency is a huge part of business/life
- How parallel is it?
- Requests from different users are independent, and candidate routes for a single query can also be evaluated in parallel
- Difficulties
- Multiple factors to consider
- Factors could change on an individual basis
- How to measure the improvement
- the decrease in cost
- the decrease in travel time
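A minimal single-source shortest-path sketch for the idea above, using Dijkstra's algorithm. The road network, node names, and edge weights are made-up examples, with weights standing in for travel time or cost:

```python
import heapq

def dijkstra(graph, start, goal):
    # graph: {node: [(neighbor, weight), ...]}
    dist = {start: 0}
    pq = [(0, start)]
    while pq:
        d, node = heapq.heappop(pq)
        if node == goal:
            return d
        if d > dist.get(node, float('inf')):
            continue                   # stale queue entry
        for nbr, w in graph.get(node, []):
            nd = d + w
            if nd < dist.get(nbr, float('inf')):
                dist[nbr] = nd
                heapq.heappush(pq, (nd, nbr))
    return float('inf')                # goal unreachable

roads = {
    'A': [('B', 4), ('C', 1)],
    'C': [('B', 2), ('D', 5)],
    'B': [('D', 1)],
}
cost = dijkstra(roads, 'A', 'D')       # cheapest route is A -> C -> B -> D
```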
Matrix Multiplication
Sami Mourad
The process of matrix multiplication arises in many contexts – for example, solving simultaneous and differential equations, or as a building block of other procedures such as those used in image processing.
This is a small-scale problem that is very easy to parallelize. In particular, different output entries can be computed simultaneously, and the procedure as a whole can be organized into stages based on which computations depend on one another. Improvements can easily be shown by comparing runtime with and without parallelization.
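The independence claim above can be made concrete: every output entry C[i][j] is a self-contained dot product, so rows (or individual entries) can be assigned to different workers with no coordination. A plain-Python reference version, using nested lists rather than any particular matrix library:

```python
def matmul(a, b):
    # a is n x k, b is k x m; each out[i][j] is an independent dot product.
    n, k, m = len(a), len(b), len(b[0])
    assert len(a[0]) == k
    return [[sum(a[i][t] * b[t][j] for t in range(k))
             for j in range(m)]
            for i in range(n)]

c = matmul([[1, 2], [3, 4]], [[5, 6], [7, 8]])
```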
Other examples:
- Protein folding
- Floating-point/precision arithmetic
- Scalable graphics
- Data compression
- Encryption/decryption
- Memory accessing
- Neural networks
- Control algorithms
- Image processing
- Signal processing
- Route finding
- Mathematical operations
- Solving partial differential equations
- File-type conversion such as PDF to jpg
- System simulations/modeling
- Compilation of code
- Graphics rendering (e.g. ray tracing)
- Decision-making/training AI
- Artificial Intelligence
- NP-complete heuristics
- Vilco train-coding
- Cloud gaming
- Automatic code parallelization
Ray Tracing
Carl Pearson
Problem name/description:
Raytracing is a rendering method that simulates photons to generate photorealistic images.
How big is the problem?
Raytracing is ~0% of a typical consumer workload. Raytracing is ~100% of some people's workloads.
How parallel is the problem?
This problem is embarrassingly parallel - each photon is independent. The results must be compiled into an image at the end.
What are some difficulties in implementation?
Modeling the interaction of photons - figuring out what types of instructions should be accelerated for the best results.
How would you demonstrate improvement?
Probably benchmarking, or writing a toy program that approximates simple raytracing behavior.
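The "each photon is independent" point can be shown with a toy renderer: orthographic rays are tested against a single sphere, one pixel at a time. A real ray tracer would trace reflected and refracted rays, but the per-pixel independence that makes the problem embarrassingly parallel is the same:

```python
def render(width, height, cx, cy, radius):
    # Each pixel's ray test is self-contained, so pixels could be
    # distributed across any number of workers and the rows merged
    # into the final image at the end.
    image = []
    for py in range(height):
        row = ''
        for px in range(width):
            # Ray at (px, py) travelling along z: for this orthographic
            # toy, the sphere hit test reduces to point-in-circle.
            hit = (px - cx) ** 2 + (py - cy) ** 2 <= radius ** 2
            row += '#' if hit else '.'
        image.append(row)
    return image

picture = render(9, 5, cx=4, cy=2, radius=2)
```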
Decryption
Michael Bowerman
- Decryption is the process of interpreting information that has been encrypted using some sort of algorithm
- Decryption is fairly important – a lot of information these days is considered secure and therefore encrypted. Those who want to protect data (and just about everyone has some information they need to protect) will want stronger encryption methods, and those who want to steal information from others would like faster decryption methods to retaliate.
- Most decryption methods involve simply brute-forcing all possible keys. If the process is parallelized, it should therefore speed up roughly in proportion to the number of processors, because each processor can try different combinations.
- One potential difficulty of parallelizing the system is that the processors need a way to ensure they do not try combinations another processor has already tried, for example by partitioning the key space up front or by communicating as they go. This coordination could slow the system down a bit (though I doubt it would slow it down much).
- The improvement could be measured simply by testing it on a few encrypted messages and comparing the time the parallel system takes to decrypt it to the time the serial system takes to decrypt it.
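A sketch of the static key-space split: each "worker" gets a disjoint slice of the key space up front, so no communication is needed to avoid duplicated work. The single-byte XOR cipher here is a stand-in for a real algorithm, purely for illustration:

```python
def xor_cipher(data, key):
    # Toy cipher: XOR every byte with a one-byte key.
    return bytes(b ^ key for b in data)

def crack_slice(ciphertext, known_plaintext, key_range):
    # Brute-force one disjoint slice of the key space.
    for key in key_range:
        if xor_cipher(ciphertext, key) == known_plaintext:
            return key
    return None

secret = xor_cipher(b'attack at dawn', 0x5A)

# Four "workers", four disjoint slices of the 0..255 key space:
slices = [range(i * 64, (i + 1) * 64) for i in range(4)]
results = [crack_slice(secret, b'attack at dawn', s) for s in slices]
found = next(k for k in results if k is not None)
```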
AI - Ensemble Classifiers
Gary Lent
- Training or running an ensemble classifier on some machine learning problem.
- A fairly large problem, as ensemble classifiers are used frequently and
tend to be very slow to train and run.
- This is an embarrassingly parallel problem. Each classifier can be trained
and can run completely independently from all of the other classifiers, with
final results decided by polling all of the results.
- Memory accesses could be a significant bottleneck, as training and
classifying with an ensemble classifier would require multiple parallel
classifiers to go through all of the data individually.
- Speedup in training and classification.
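The polling step described above can be sketched with trivial threshold "stumps" on a one-dimensional feature (the thresholds are arbitrary): each classifier runs independently, and the ensemble's answer is a majority vote over their outputs.

```python
def make_stump(threshold):
    # A minimal classifier: predict 1 if the feature exceeds a threshold.
    return lambda x: 1 if x > threshold else 0

def ensemble_predict(classifiers, x):
    # Each vote is independent, so this loop is the parallelizable part.
    votes = [clf(x) for clf in classifiers]
    return 1 if sum(votes) * 2 > len(votes) else 0

stumps = [make_stump(t) for t in (0.2, 0.5, 0.8)]
label = ensemble_predict(stumps, 0.6)   # two of three stumps vote 1
```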
Automatic Parallel Code
Max Korbel
- Automatically parallelized code:
A processor capable of looking far ahead in the instruction stream, to the point where it can determine which parts of the code can be sent to other threads temporarily and merged back later. Ideally, this could remove much of the need for programmers to thread their code by hand for parallel processing. The processor could automatically determine the best thing to do.
- Parallelizing code is not a completely solved problem. It is inconvenient for programmers to need to manually thread a lot of computation. The processor would have to look through a massive number of instructions, essentially performing branch prediction and checking for memory coherence problems throughout.
- It would become more and more difficult for the processor to split up tasks the more cores there are. It would probably do pretty well, if implemented correctly, at parallelizing very repetitive simulation-type code.
- It would be extremely difficult to look far enough ahead at the instructions to determine how safe it is to split off groups of tasks to another processor. Memory consistency checks could be absurdly complicated. Memory optimization would be really cool but also very difficult.
- The performance increase should be very noticeable if a program was successfully automatically parallelized and threaded.
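A toy version of the dependence check such a processor would need: each "instruction" lists the registers it reads and writes, and two instructions may run in parallel only if neither writes something the other reads or writes. The instruction tuples are invented for illustration:

```python
def independent(a, b):
    # Safe to run in parallel only if there is no write/read or
    # write/write overlap in either direction.
    (_, reads_a, writes_a), (_, reads_b, writes_b) = a, b
    return not (writes_a & (reads_b | writes_b) or
                writes_b & reads_a)

# (name, registers read, registers written)
i1 = ('add r1, r2, r3', {'r2', 'r3'}, {'r1'})
i2 = ('add r4, r5, r6', {'r5', 'r6'}, {'r4'})
i3 = ('mul r7, r1, r1', {'r1'}, {'r7'})

can_pair = independent(i1, i2)         # disjoint registers
must_serialize = independent(i1, i3)   # i3 reads r1, which i1 writes
```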
Arbitrary Precision Arithmetic
Carl Walsh
Arbitrary-precision numbers are often used in programs, and, because no one has yet built an arbitrary-precision ALU to perform the calculation in one step, the numbers must be processed in fixed-width chunks. If we consider XORing two numbers that are millions of bits long, the task is completely parallelizable: we just assign each processor its own bit-width's worth of chunks and send them through its ALU. However, if we wanted to add the two numbers, we might hit a problem with the carry causing time delay. The solution might be to build a carry-lookahead adder of sorts out of the processors to get something like O(log n) runtime. It might be hard to work out the logic of an adder, multiplier, and divider that makes optimal use of the already-present ALUs, but this project would be quite feasible. To measure the improvement we could build benchmarks from those of single-core arbitrary-precision libraries.
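The chunked representation above can be sketched as lists of 32-bit "limbs" (least-significant first). The per-limb adds could be issued in parallel; the ripple carry in this sketch is exactly the serial dependency that a carry-lookahead scheme would attack:

```python
BASE = 2 ** 32  # each limb holds one 32-bit chunk

def add_limbs(a, b):
    # a and b are same-length limb lists, least-significant limb first.
    out, carry = [], 0
    for x, y in zip(a, b):
        s = x + y + carry
        out.append(s % BASE)
        carry = s // BASE
    if carry:
        out.append(carry)              # overflow into a new limb
    return out

# (2^32 - 1) + 1 = 2^32: the carry ripples into the next limb.
result = add_limbs([BASE - 1, 0], [1, 0])
```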
Data Encryption
Sean Velazquez
- Problem Name/Description
De/encrypting large amounts of data has become an issue for websites that use HTTPS, so parallelism could help with the server's workload.
- Size of the problem
The size of the problem is increasing as the number of internet users increases and the need for secure communication while on the internet increases as well. More web services are working to implement HTTPS to help protect their users from hacking/snooping.
- How parallel is it?
It is very parallel, because a web server has to interact with many users and requires decryptions of independent streams of data.
- Difficulties:
The hardest part would be implementing a secure way to do the large number of calculations required for de/encryption, and designing the hardware for these calculations will be difficult.
- How to measure the improvement:
Measure the throughput of the de/encrypted data through the processor; the higher the throughput, the better the performance.
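The per-connection independence noted above can be sketched as follows: each client stream has its own key and shares no state with the others, so streams can be encrypted by separate workers. The repeating-key XOR here is only a placeholder for a real cipher such as AES, and the client names and keys are invented:

```python
def encrypt_stream(data, key):
    # Toy stream "cipher": XOR with a repeating key. XOR is its own
    # inverse, so applying it twice recovers the plaintext.
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))

streams = {
    'client-1': (b'GET /index.html', b'\x13\x37'),
    'client-2': (b'POST /login', b'\xaa\x55'),
}

# Each entry below is independent work, ideal for a worker pool;
# throughput should scale with the number of workers.
ciphertexts = {cid: encrypt_stream(data, key)
               for cid, (data, key) in streams.items()}
```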
Bitcoin Miner
Dong-hyeon Park
Size of problem: Mining Bitcoins is getting increasingly difficult, as most of the “easy” mines have already been exploited.
Parallelism: We can parallelize the brute-force hashing process so that we work on multiple nodes at once (SIMD) or try several different hashes on a single node (MISD).
Difficulties: Setting up various protocols for Bitcoin mining and making sure each process is working on a different task/data and not overlapping with each other.
How to measure improvement: Compare performance of mining a known(?) bitcoin against that of a single processor or an embedded system.
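The nonce search can be sketched as workers scanning disjoint nonce ranges for a hash below a target. Real Bitcoin mining double-hashes an 80-byte block header against a much harder target; here the header string and the "first byte is zero" difficulty are invented for illustration:

```python
import hashlib

def mine_range(header, nonces, difficulty_prefix=b'\x00'):
    # Scan one disjoint slice of the nonce space; return a winning
    # nonce, or None if this slice has no solution.
    for nonce in nonces:
        digest = hashlib.sha256(f'{header}:{nonce}'.encode()).digest()
        if digest.startswith(difficulty_prefix):
            return nonce
    return None

# Two "workers" on disjoint halves of a small nonce space:
header = 'block-42'
results = [mine_range(header, range(0, 500)),
           mine_range(header, range(500, 1000))]
```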
Path Finding
Matt Tambara
- Path Finding - finding efficient paths between two points.
- Size - We definitely use this all the time; it can be especially useful for search and rescue.
- Parallel - It can be extremely parallel if you brute force it and try every path at once.
- Difficulties - Gets really expensive as the size of the maze increases.
- Improvement - Can measure how long it takes to find the best path vs. current methods.
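The "try every path at once" idea above corresponds to breadth-first search, which expands the whole frontier each step; the cells on the frontier are independent and could be expanded by parallel workers. A small maze sketch with an invented map:

```python
from collections import deque

def shortest_path_length(maze, start, goal):
    # maze: list of strings, '#' = wall. Returns the minimum number of
    # steps from start to goal, or -1 if the goal is unreachable.
    rows, cols = len(maze), len(maze[0])
    queue = deque([(start, 0)])
    seen = {start}
    while queue:
        (r, c), steps = queue.popleft()
        if (r, c) == goal:
            return steps
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if (0 <= nr < rows and 0 <= nc < cols
                    and maze[nr][nc] != '#' and (nr, nc) not in seen):
                seen.add((nr, nc))
                queue.append(((nr, nc), steps + 1))
    return -1

maze = ['..#.',
        '..#.',
        '....']
length = shortest_path_length(maze, (0, 0), (0, 3))
```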
Human Brain
Katie Hauser
- Problem: Simulating human brain activity
- Size: Quite a large undertaking. Can be reduced to portions of the brain responsible for smaller sets of tasks, or can involve isolating specific neurons
- Parallelism: Many things happening simultaneously at multiple levels, e.g. controlling many muscles at once
- Difficulties: Brains are very complicated and much is unknown about them
- Measuring Improvements: can compare how fast various simulators and human brains accomplish the same task.
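A minimal sketch of simulating a small piece of the brain: a leaky integrate-and-fire neuron, one of the simplest standard neuron models. Many such neurons update independently within each time step, which is where the parallelism would come from; the leak, threshold, and input currents here are arbitrary:

```python
def simulate_lif(inputs, leak=0.9, threshold=1.0):
    # Leaky integrate-and-fire: the membrane potential v decays,
    # accumulates input current, and fires (then resets) at threshold.
    v, spikes = 0.0, []
    for t, current in enumerate(inputs):
        v = v * leak + current         # leaky integration
        if v >= threshold:             # fire and reset
            spikes.append(t)
            v = 0.0
    return spikes

spike_times = simulate_lif([0.5, 0.5, 0.5, 0.0, 0.6, 0.6])
```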
MMO Game Server
Brent Stapleton
- Problem: Massively Multiplayer Online video games (server side processing)
- Size: Very large. World of Warcraft has ~10 million subscribers, which demands large server farms to serve those players.
- How parallel? Probably very parallel. Groups of entities (players, objects, etc.) could use their own processor. The upper limit is some proportion of players (like one processor per 10 players, or something), and too many processors could mean that hardware goes unused when there aren't players online.
- Difficulties: Player input is variable, so it may be difficult to decide which entities are managed by which processor. Also, some groups of entities may be more processor-intensive than others (it may be difficult to balance the size of groups).
- How to measure improvement: measure server tick rates and client fps.
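The entity-grouping idea above can be sketched by hashing each player id to a shard, so each shard's update loop can run on its own processor. The shard count and player names are made up, and a real server would also need rebalancing and cross-shard interaction:

```python
import zlib

def shard_for(player_id, num_shards):
    # Stable hash (unlike Python's salted str hash) so a player always
    # lands on the same shard across runs and machines.
    return zlib.crc32(player_id.encode()) % num_shards

def group_by_shard(players, num_shards):
    shards = {i: [] for i in range(num_shards)}
    for p in players:
        shards[shard_for(p, num_shards)].append(p)
    return shards

shards = group_by_shard(['alice', 'bob', 'carol', 'dave'], num_shards=4)
```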