1 #ifndef SORT_EXTERNALSORT_H
2 #define SORT_EXTERNALSORT_H
24 typedef std::function<int(
const char *item1,
const char *item2)>
43 static std::unique_ptr<ExternalSort>
104 return *
reinterpret_cast<uint64_t*
>(&rid);
114 void Rewind(uint64_t pos)
override;
The output operator that can used to scan through the output of this ExternalSort instance in ascendi...
Definition: ExternalSort.h:65
void EndScan() override
Ends the scan.
Definition: ExternalSort.cpp:245
~OutputIterator() override
Deconstructor to clean all the states.
Definition: ExternalSort.cpp:203
VarlenDataPage m_pg
Definition: ExternalSort.h:128
PageNumber m_pid
Definition: ExternalSort.h:124
void Rewind(uint64_t pos) override
Rewind to a particular position saved before.
Definition: ExternalSort.cpp:235
char * m_buf
Definition: ExternalSort.h:127
uint64_t SavePosition() const override
Save the position of current iterator so that later on can be rewind to this position.
Definition: ExternalSort.h:102
SlotId m_sid
Definition: ExternalSort.h:126
bool Next() override
Move the iterator to next output record.
Definition: ExternalSort.cpp:208
PageNumber m_final_num_pages
Definition: ExternalSort.h:125
bool SupportsRewind() const override
Indicate if this iterator supports rewind.
Definition: ExternalSort.h:97
OutputIterator(const File *tmpfile, PageNumber final_num_pages)
Construct an iterator from the tmp file containing the final sorted items.
Definition: ExternalSort.cpp:194
const File * m_file
Definition: ExternalSort.h:122
const char * GetCurrentItem(FieldOffset &p_reclen) override
Return the item pointed by this iterator.
Definition: ExternalSort.cpp:228
ExternalSort is a general utility class to do external sorting on any number of general bytes.
Definition: ExternalSort.h:35
ExternalSort(const SortCompareFunction &comp, size_t merge_ways)
Definition: ExternalSort.cpp:6
char * m_inputbuf
Definition: ExternalSort.h:177
size_t m_merge_ways
how many ways of merge this operator allows, implicitly define memory budget as (m_merge_ways + 1) pa...
Definition: ExternalSort.h:171
void SortInitialRuns(ItemIterator *item_iter)
Iterate through item_iterator and create the initial runs based on memory budget.
Definition: ExternalSort.cpp:60
std::unique_ptr< ItemIterator > Sort(ItemIterator *input_iter)
Given an input_iter, it iterates over input_iter in one pass, and sorts all the returned items using ...
Definition: ExternalSort.cpp:173
void WriteOutInitialPass(std::vector< Record > &pass)
Definition: ExternalSort.cpp:48
PageNumber MergeInternalRuns(size_t low_run, size_t high_run)
Merge the runs low_run (inclusive) to high_run (inclusive) from last pass whose boundaries are stored...
Definition: ExternalSort.cpp:97
SortCompareFunction m_cmp
compare functions for sorting items.
Definition: ExternalSort.h:173
char * m_outbuf
Definition: ExternalSort.h:178
VarlenDataPage m_outpg
Definition: ExternalSort.h:179
~ExternalSort()
Deconstructor of ExternalSort
Definition: ExternalSort.cpp:15
PageNumber m_output_pos
Definition: ExternalSort.h:180
void GenerateNewPass(std::vector< PageNumber > &new_run_boundaries)
Generate a new sorted pass through a m_merge_ways merge from m_file[1 - m_current_pass].
Definition: ExternalSort.cpp:159
std::unique_ptr< File > m_file[2]
tmp files used in external sorting
Definition: ExternalSort.h:166
void WriteOutRecord(Record &rec)
Definition: ExternalSort.cpp:28
uint8_t m_current_pass
0 or 1, indicates which file current pass is using.
Definition: ExternalSort.h:164
std::vector< PageNumber > m_run_boundaries
Saves the page boundaries of last pass sorted runs.
Definition: ExternalSort.h:175
static std::unique_ptr< ExternalSort > Create(const SortCompareFunction &comp, size_t merge_ways)
Create an ExternalSort instance where comp provides the method of comparision function of two items.
Definition: ExternalSort.cpp:23
Represents an open virtual file managed by the FileManager.
Definition: FileManager.h:322
A pure abstract iterator over items (i.e., plain byte arrays).
Definition: ItemIterator.h:18
VarlenDataPage implements a buffered heap page that supports inserting, deleting, updating and random...
Definition: VarlenDataPage.h:50
uint16_t SlotId
Definition: tdb_base.h:228
int16_t FieldOffset
Definition: tdb_base.h:217
uint32_t PageNumber
Definition: tdb_base.h:219
std::function< int(const char *item1, const char *item2)> SortCompareFunction
SortCompareFunction compares item1 and item2 and returns a negative integer if item1 < item2,...
Definition: ExternalSort.h:25
The record ID of a record on a page is a pair of ‘(PageNumber, SlotId)’.
Definition: Record.h:17