HPC Question Bank with 430 Questions

The document is a question bank for High Performance Computing (HPC) covering 17 chapters with 430 verified questions and study resources. Each chapter focuses on different aspects of CUDA programming, parallel execution, communication patterns, and algorithms in distributed systems. Sample questions from various chapters are provided to aid in studying and understanding key concepts.

Uploaded by

ylkqxprqdy

High Performance Computing (HPC)

Question Bank

17 Chapters
430 Verified Questions
Chapter 1: Understanding CUDA Kernel Code and Host GPU Interactions
Available Study Resources on Examlex for this Chapter
24 Verified Questions
24 Flashcards
Source URL: [Link]

Sample Questions
Q1) In CUDA, a single invoked kernel is referred to as a _____.
A)block
B)thread
C)grid
D)none of the above
Answer: C

Q2) A block is comprised of multiple _______.
A)threads
B)bunch
C)host
D)none of the above
Answer: A

Q3) The kernel code is identified by the ________ qualifier with void return type.
A)__host__
B)__global__
C)__device__
D)void
Answer: B

To view all questions and flashcards with answers, click on the resource link above.
Chapter 2: Nvidia CUDA and GPU Programming
Available Study Resources on Examlex for this Chapter
24 Verified Questions
24 Flashcards
Source URL: [Link]

Sample Questions
Q1) A simple kernel for adding two integers:
__global__ void add( int *a, int *b, int *c ) { *c = *a + *b; }
Where __global__ is a CUDA C keyword which indicates that:
A)add() will execute on device, add() will be called from host
B)add() will execute on host, add() will be called from device
C)add() will be called and executed on host
D)add() will be called and executed on device
Answer: A

Q2) CUDA stands for ________, designed by NVIDIA.
A)common union discrete architecture
B)complex unidentified device architecture
C)compute unified device architecture
D)complex unstructured distributed architecture
Answer: C

Q3) Out-of-order instruction execution is not possible on GPUs.
A)True
B)False
Answer: False

Chapter 3: Parallel Execution in CUDA, Sorting Algorithms, and Search Techniques
Available Study Resources on Examlex for this Chapter
25 Verified Questions
25 Flashcards
Source URL: [Link]

Sample Questions
Q1) Breadth-first search is equivalent to which traversal of a binary tree?
A)pre-order traversal
B)post-order traversal
C)level-order traversal
D)in-order traversal
Answer: C
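
The equivalence in Q1 can be checked with a small sketch. The `Node` class and the `level_order` function are hypothetical helpers written only for this illustration. The function is an ordinary breadth-first search driven by a FIFO queue, and on a binary tree it visits the nodes one level at a time.

```python
from collections import deque


class Node:
    """Hypothetical binary-tree node used only for this illustration."""

    def __init__(self, value, left=None, right=None):
        self.value = value
        self.left = left
        self.right = right


def level_order(root):
    """Breadth-first search from the root; returns values in visiting order."""
    order = []
    queue = deque([root] if root is not None else [])
    while queue:
        node = queue.popleft()          # oldest discovered node first (FIFO)
        order.append(node.value)
        for child in (node.left, node.right):
            if child is not None:
                queue.append(child)     # children wait behind the current level
    return order


# Example tree:      1
#                  /   \
#                 2     3
#                / \     \
#               4   5     6
tree = Node(1, Node(2, Node(4), Node(5)), Node(3, None, Node(6)))
print(level_order(tree))
```

Swapping the queue for a stack would turn the same loop into a depth-first traversal, which is why the choice of container is the essential difference between the two searches.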

Q2) The main advantage of ______ is that its storage requirement is linear in the depth
of the state space being searched.
A)bfs
B)dfs
C)a and b
D)none of above
Answer: B

Q3) The complexity of bubble sort is Θ(n²).
A)True
B)False
Answer: True
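
As a rough check on the quadratic growth in Q3, the sketch below counts comparisons in a plain bubble sort. It assumes the simplest variant with no early-exit flag, which always performs n(n-1)/2 comparisons; variants that stop after a pass with no swaps do fewer on input that is already sorted. The function name `bubble_sort_comparisons` is made up for this example.

```python
def bubble_sort_comparisons(values):
    """Sort a copy with plain bubble sort (no early exit) and count comparisons."""
    data = list(values)
    comparisons = 0
    n = len(data)
    for i in range(n - 1):
        # After pass i, the largest i + 1 elements are already in place.
        for j in range(n - 1 - i):
            comparisons += 1
            if data[j] > data[j + 1]:
                data[j], data[j + 1] = data[j + 1], data[j]
    return data, comparisons


# Compare the measured count with n(n-1)/2 for a few reversed inputs.
for n in (4, 8, 16):
    _, count = bubble_sort_comparisons(range(n, 0, -1))
    print(n, count, n * (n - 1) // 2)
```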

Chapter 4: Characteristics and Operations in Parallel Programming
Available Study Resources on Examlex for this Chapter
24 Verified Questions
24 Flashcards
Source URL: [Link]

Sample Questions
Q1) The DNS algorithm for matrix multiplication uses
A)1d partition
B)2d partition
C)3d partition
D)both a,b

Q2) Efficient implementation of basic communication operations can improve
A)performance
B)communication
C)algorithm
D)all

Q3) One-to-all broadcast uses
A)recursive doubling
B)simple algorithm
C)both
D)none
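
Recursive doubling, one of the options in Q3, can be illustrated with a small simulation. This is a sketch under stated assumptions: the node count `p` is a power of two, nodes are labelled 0 to p-1, and the function name `recursive_doubling_broadcast` is invented for this example. In each step every node that already holds the message forwards it to a partner whose label differs in one bit, so the number of informed nodes doubles per step.

```python
def recursive_doubling_broadcast(p, source=0):
    """Simulate one-to-all broadcast among p nodes (p a power of two).

    Returns a list of steps; each step is a list of (sender, receiver)
    pairs that can all proceed in parallel.
    """
    if p < 1 or p & (p - 1) != 0:
        raise ValueError("this sketch assumes p is a power of two")
    has_data = {source}
    steps = []
    distance = 1
    while distance < p:
        # Every node that already holds the message forwards it to the
        # partner whose label differs in exactly one bit.
        transfers = [(node, node ^ distance) for node in sorted(has_data)]
        has_data.update(receiver for _, receiver in transfers)
        steps.append(transfers)
        distance *= 2
    return steps


for number, transfers in enumerate(recursive_doubling_broadcast(8), start=1):
    print(f"step {number}: {transfers}")
```

With eight nodes the message reaches everyone in three steps, compared with seven sends if the source contacted each node in turn.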

Q4) One processor has a piece of data and needs to send it to everyone. This is
A)one-to-all
B)all-to-one
C)point-to-point
D)all of above

Chapter 5: Communication Patterns and Algorithms in Distributed Systems
Available Study Resources on Examlex for this Chapter
25 Verified Questions
25 Flashcards
Source URL: [Link]

Sample Questions
Q1) In a broadcast and reduction on a balanced binary tree, reduction is done in ______
A)recursive order
B)straight order
C)vertical order
D)parallel order

Q2) Which is known as Broadcast?


A)one-to-one
B)one-to-all
C)all-to-all
D)all-to-one

Q3) Each node first sends to one of its neighbours the data it needs to ....
A)broadcast
B)identify
C)verify
D)none

Chapter 6: Communication Operations and Algorithms in Parallel Computing
Available Study Resources on Examlex for this Chapter
25 Verified Questions
25 Flashcards
Source URL: [Link]

Sample Questions
Q1) Using different links every time and forwarding in parallel again is
A)better for congestion
B)better for reduction
C)better for communication
D)better for algorithm

Q2) The ____ do not snoop the messages going through them.
A)nodes
B)variables
C)tuple
D)list

Q3) In a balanced binary tree, the number of processing nodes is equal to
A)leaves
B)number of elements
C)branch
D)none

Q4) Using only connections between single pairs of nodes at a time is
A)good utilization
B)poor utilization
C)massive utilization
D)medium utilization

Chapter 7: Parallel Computing and Graph Algorithms
Available Study Resources on Examlex for this Chapter
25 Verified Questions
25 Flashcards
Source URL: [Link]

Sample Questions
Q1) Which problems can be handled by recursive decomposition?
A)backtracking
B)greedy method
C)divide and conquer problem
D)branch and bound

Q2) Which of the following is not a form of parallelism supported by CUDA?
A)vector parallelism - floating point computations are executed in parallel on wide vector units
B)thread level task parallelism - different threads execute different tasks
C)block and grid level parallelism - different blocks or grids execute different tasks
D)data parallelism - different threads and blocks process different parts of data in memory

Q3) The threads in a thread block are distributed across SM units so that each thread is
executed by one SM unit.
A)True
B)False

Q4) CUDA is a parallel computing platform and programming model


A)True
B)False

Chapter 8: Parallel Processing and GPU Architecture
Available Study Resources on Examlex for this Chapter
25 Verified Questions
25 Flashcards
Source URL: [Link]

Sample Questions
Q1) When instruction i and instruction j both tend to write to the same register or
memory location, it is called
A)input dependence
B)output dependence
C)ideal pipeline
D)digital call

Q2) Parallel processing may occur
A)in the instruction stream
B)in the data stream
C)both (a) and (b)
D)none of the above

Q3) The style of parallelism supported on GPUs is best described as


A)misd - multiple instruction single data
B)simt - single instruction multiple thread
C)sisd - single instruction single data
D)mimd

Q4) Functions annotated with the __global__ qualifier may be executed on the host or
the device
A)True
B)False

Chapter 9: Computer Architecture and Multiprocessor Systems
Available Study Resources on Examlex for this Chapter
24 Verified Questions
24 Flashcards
Source URL: [Link]

Sample Questions
Q1) Which combinational device is used in crossbar switch for selecting proper memory
from multiple addresses?
A)multiplexer
B)decoder
C)encoder
D)demultiplexer

Q2) M.J. Flynn's parallel processing classification is based on:


A)multiple instructions
B)multiple data
C)both (a) and (b)
D)none of the above

Q3) In a three-cube structure, node 101 cannot communicate directly with which node?
A)001
B)011
C)100
D)111
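
The adjacency rule behind Q3 can be sketched in a few lines, treating every option label as a 3-bit binary number. In a hypercube two nodes share a direct link exactly when their labels differ in one bit. The helper names `hypercube_neighbors` and `directly_connected` are hypothetical and exist only for this illustration.

```python
def hypercube_neighbors(node, dimensions):
    """Labels directly linked to `node` in a hypercube: flip one bit at a time."""
    return [node ^ (1 << bit) for bit in range(dimensions)]


def directly_connected(a, b):
    """Two hypercube nodes share a link when their labels differ in exactly one bit."""
    return bin(a ^ b).count("1") == 1


node = 0b101
# List the three direct neighbours of node 101 as 3-bit strings.
print([format(n, "03b") for n in hypercube_neighbors(node, 3)])
# Check each option label against node 101.
for label in ("001", "011", "100", "111"):
    print(label, directly_connected(node, int(label, 2)))
```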

Q4) In super-scalar mode, all the similar instructions are grouped and executed
together.
A)True
B)False

Chapter 10: Sorting Algorithms and Pipelined Systems
Available Study Resources on Examlex for this Chapter
25 Verified Questions
25 Flashcards
Source URL: [Link]

Sample Questions
Q1) ______ have been developed specifically for pipelined systems.
A)utility software
B)speed up utilities
C)optimizing compilers
D)none of the above

Q2) VLIW processors are much simpler as they do not require .....
A)computational register
B)complex logic circuits
C)ssd slots
D)scheduling hardware

Q3) Which of the following is NOT a bitonic sequence?
A){8, 6, 4, 2, 3, 5, 7, 9}
B){0, 4, 8, 9, 2, 1}
C){3, 5, 7, 9, 8, 6, 4, 2}
D){1, 2, 4, 7, 6, 0, 1}
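
One way to test the sequences in Q3 is sketched below. It assumes the simple, non-cyclic reading of "bitonic": the values rise and then fall, or fall and then rise. Textbook definitions that also allow cyclic shifts accept more sequences than this check does, so treat it as an illustration of that narrower reading only. The function name `is_bitonic` is invented for this example.

```python
def is_bitonic(seq):
    """True if seq rises then falls, or falls then rises (non-cyclic reading)."""
    # Record the direction of each step between neighbours, skipping ties.
    directions = [1 if b > a else -1 for a, b in zip(seq, seq[1:]) if a != b]
    # Count how many times the direction flips along the sequence.
    changes = sum(1 for d1, d2 in zip(directions, directions[1:]) if d1 != d2)
    return changes <= 1


options = {
    "A": [8, 6, 4, 2, 3, 5, 7, 9],
    "B": [0, 4, 8, 9, 2, 1],
    "C": [3, 5, 7, 9, 8, 6, 4, 2],
    "D": [1, 2, 4, 7, 6, 0, 1],
}
for name, seq in options.items():
    print(name, is_bitonic(seq))
```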

Q4) Which of the following is a combination of several processors on a single chip?


A)multicore architecture
B)risc architecture
C)cisc architecture
D)subword parallelism

Chapter 11: Algorithms and Parallel Formulations
Available Study Resources on Examlex for this Chapter
25 Verified Questions
25 Flashcards
Source URL: [Link]

Sample Questions
Q1) Simple backtracking is a depth-first search method that terminates upon finding the
first solution.
A)True
B)False

Q2) Odd-even transposition sort is a variation of


A)quick sort
B)shell sort
C)bubble sort
D)selection sort
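
For reference, odd-even transposition sort as named in Q2 can be sketched as follows. The code runs sequentially, but the compare-exchange pairs inside one phase are disjoint, so on a parallel machine they could all execute at the same time. The function name `odd_even_transposition_sort` is chosen for this example, and the sketch relies on the standard result that n phases are enough for n elements.

```python
def odd_even_transposition_sort(values):
    """Return a sorted copy using alternating even and odd phases."""
    data = list(values)
    n = len(data)
    for phase in range(n):
        # Even phases compare pairs (0,1), (2,3), ...;
        # odd phases compare pairs (1,2), (3,4), ...
        start = phase % 2
        for i in range(start, n - 1, 2):
            if data[i] > data[i + 1]:
                data[i], data[i + 1] = data[i + 1], data[i]
    return data


print(odd_even_transposition_sort([3, 2, 3, 8, 5, 6, 4, 1]))
```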

Q3) In parallel quicksort, the pivot selection strategy is crucial for
A)maintaining load balance
B)maintaining uniform distribution of elements in process groups
C)effective pivot selection in next level
D)all of the above

Q4) What is the average running time of the quicksort algorithm?
A)O(n)
B)O(n log n)
C)O(n²)
D)O(log n)

Chapter 12: CUDA Model, Parallelism, Memory System, and Communication Models
Available Study Resources on Examlex for this Chapter
25 Verified Questions
25 Flashcards
Source URL: [Link]

Sample Questions
Q1) A decomposition can be illustrated in the form of a directed graph with nodes
corresponding to tasks and edges indicating that the result of one task is required for
processing the next. Such a graph is called a
A)task dependency graph
B)task interaction graph
C)process interaction graph
D)process dependency graph

Q2) ______ is callable from the host.
A)__host__
B)__global__
C)__device__
D)none of above

Q3) Select correct answer: DRAM access times have only improved at the rate of
roughly___________ % per year over this interval.
A)20
B)40
C)50
D)10

Chapter 13: Parallel Communication and Runtime Analysis
Available Study Resources on Examlex for this Chapter
25 Verified Questions
25 Flashcards
Source URL: [Link]

Sample Questions
Q1) Scatter is ____________.
A)one to all broadcast communication
B)all to all broadcast communication
C)one to all personalised communication
D)none of the above.

Q2) Average Degree of Concurrency is...


A)the average number of tasks that can run concurrently over the entire duration of
execution of the process.
B)the average time that can run concurrently over the entire duration of execution of the
process.
C)the average in degree of task dependency graph.
D)the average out degree of task dependency graph.

Q3) The time that elapses from the moment the first processor starts to the moment the
last processor finishes execution is called the ___________.
A)parallel runtime
B)overhead runtime
C)excess runtime
D)serial runtime

Chapter 14: Parallel Computing and GPU Architecture
Available Study Resources on Examlex for this Chapter
25 Verified Questions
25 Flashcards
Source URL: [Link]

Sample Questions
Q1) Which one is not a characteristic of NUMA multiprocessors?
A)it allows shared memory computing
B)memory units are placed in physically different location
C)all memory units are mapped to one common virtual global memory
D)processors access their independent local memories

Q2) Which of the following statements about GPUs are true?
A)grid contains block
B)block contains threads
C)all the mentioned options.
D)sm stands for streaming multiprocessor

Q3) A processor fetching or decoding a different instruction during the execution of
another instruction is called ______?
A)super-scaling
B)pipe-lining
C)parallel computation
D)none of these

Chapter 15: Exploring Multiprocessor Systems and Parallel Algorithms
Available Study Resources on Examlex for this Chapter
25 Verified Questions
25 Flashcards
Source URL: [Link]

Sample Questions
Q1) The number of tasks into which a problem is decomposed determines its?
A)granularity
B)priority
C)modernity
D)none of above

Q2) Speedup is defined as which ratio?
A)s = ts/tp
B)s = tp/ts
C)ts = s/tp
D)tp = s/ts
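
Under the standard definition, speedup is the serial runtime divided by the parallel runtime. A minimal sketch follows; the function name `speedup` and the timings are hypothetical values chosen for illustration, not measurements from any system.

```python
def speedup(serial_time, parallel_time):
    """Speedup = serial runtime / parallel runtime (both in the same unit)."""
    if parallel_time <= 0:
        raise ValueError("parallel_time must be positive")
    return serial_time / parallel_time


# Hypothetical timings: 120 s on one processor, 20 s on the parallel machine.
print(speedup(120.0, 20.0))
```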

Q3) Interaction overheads can be minimized by____?


A)maximize data locality
B)maximize volume of data exchange
C)increase bandwidth
D)minimize social media contents

Q4) NUMA architecture uses _______ in design?
A)cache
B)shared memory
C)message passing
D)distributed memory

Chapter 16: Parallel Processing and Algorithms
Available Study Resources on Examlex for this Chapter
25 Verified Questions
25 Flashcards
Source URL: [Link]

Sample Questions
Q1) The dual of one-to-all broadcast is ?
A)all-to-one reduction
B)all-to-one receiver
C)all-to-one sum
D)none of above

Q2) Gather operation is also known as ________?


A)one to all personalized communication
B)one to all broadcast
C)all to one reduction
D)all to all broadcast

Q3) The ___ time collectively spent by all the processing elements is T_all = p T_P.
A)total
B)average
C)mean
D)sum

Q4) Task characteristics include?


A)task generation.
B)task sizes.
C)size of data associated with tasks.
D)all of above.

Chapter 17: Distributed Systems and Computing
Available Study Resources on Examlex for this Chapter
34 Verified Questions
34 Flashcards
Source URL: [Link]

Sample Questions
Q1) How many development generations has computer technology gone through?
A)6
B)3
C)4
D)5

Q2) The significant characteristics of distributed systems are of how many types?
A)5 types
B)2 types
C)3 types
D)4 types

Q3) Data access and storage are elements of job throughput, of __________?
A)flexibility
B)adaptation
C)efficiency
D)dependability

Q4) Utility computing focuses on a______________ model?


A)data
B)cloud
C)scalable
D)business
