Sharing some Bits

To be a lean developer through sharing!

Problem

Tasks in our AWS ECS cluster suddenly started failing with CannotPullContainerError. The error message was not very helpful. It only said: ref pull has been retried 1 time(s): number of layers and diffIDs don't match: 3 != 15.

Debugging

  1. Pulled the image locally and inspected it: it has 15 layers, as mentioned in the error, but somehow only 3 were pulled. It was not clear where the image layers were being lost.
  2. Verified there is no connectivity issue between our ECS cluster and our image repo.
  3. Updated our Docker build action, docker/build-push-action, from v5 to v6 in the GitHub pipeline. No luck.

Actual Issue

The underlying issue was that ECS currently does not support mixed compression types within the same image manifest. Our base image recently started using zstd for Docker layer compression, while our build was still using the old gzip. The final image therefore had both types of layers, which AWS ECS does not support. We figured this out by comparing the Docker manifest of a working image with that of the broken image.
We used docker manifest inspect your-image:tag to inspect and compare the docker images. Once we identified that

These were our image layers:

{
  "layers": [
    {
      "mediaType": "application/vnd.docker.image.rootfs.diff.tar.zstd",
      // Base image layer (now zstd)
    },
    {
      "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip",
      // Layer added by our build (default gzip) - which caused problems
    }
  ]
}
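The manifest comparison can also be automated. Below is a minimal C++ sketch, assuming the layer media types have already been extracted from the output of docker manifest inspect into a list of strings. The helper names compressionOf and hasMixedCompression are hypothetical, and classifying a layer by the suffix of its media type is a simplification chosen for illustration.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <vector>

// Hypothetical helper: classify a layer by the suffix of its media type.
std::string compressionOf(const std::string& mediaType) {
    auto endsWith = [&mediaType](const std::string& suffix) {
        return mediaType.size() >= suffix.size() &&
               mediaType.compare(mediaType.size() - suffix.size(),
                                 suffix.size(), suffix) == 0;
    };
    if (endsWith("gzip")) return "gzip";
    if (endsWith("zstd")) return "zstd";
    return "other";
}

// Returns true when the layers do not all share one compression kind.
bool hasMixedCompression(const std::vector<std::string>& layerMediaTypes) {
    std::set<std::string> kinds;
    for (const std::string& mediaType : layerMediaTypes) {
        kinds.insert(compressionOf(mediaType));
    }
    return kinds.size() > 1;
}

// Self-check using the two media types from the manifest above.
void selfCheck() {
    assert(hasMixedCompression({
        "application/vnd.docker.image.rootfs.diff.tar.zstd",
        "application/vnd.docker.image.rootfs.diff.tar.gzip"}));
}
```

A check like this could run in the pipeline before deployment, so a mixed image is caught before ECS tries to pull it.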

Fix

To use zstd for Docker layer compression, like our base image, we updated the GitHub Action that builds the Docker image. Passing outputs: type=image,oci-mediatypes=true,compression=zstd,compression-level=3,force-compression=true forces the Docker build to use zstd instead of the default gzip:

uses: docker/build-push-action@v6
with:
  context: .
  file: Dockerfile
  push: true
  tags: ${{ steps.release-tags.outputs.release-tags }}
  outputs: type=image,oci-mediatypes=true,compression=zstd,compression-level=3,force-compression=true

  • My GSoC work on GitHub: Commits
    (Maybe the link changes, if that is the case please search for: “[author:vrnithinkumar author-date:2020-06-01..2020-08-31]” in the repository of the LLVM project)
  • My GSoC work on Phabricator: Reviews
  • GSoC Project Page Link
  • Original report link

Abstract

The Clang Static Analyzer is used to find bugs in a program by analyzing its source code without compiling and executing it. It uses symbolic computations to find the defects. The Analyzer covers a variety of checks targeted at finding security and API usage bugs, dead code, null dereference, division by zero, and other logic errors. The Clang Static Analyzer already has a checker to find null pointer dereferences in code, but it is not sufficient for higher abstractions such as C++ smart pointers or optionals. By explicitly teaching it the behavior of C++ standard classes, we can make the Analyzer find more bugs related to modern C++ code.

Goal

Enable the Clang Static Analyzer to find occurrences of null smart pointer dereferences by teaching it the observed behaviors of C++ smart pointer classes. Improve the analyzer’s ability to realize the values of the standard smart pointers without having to dig deep into the complex implementation details. We should be able to find more null dereference bugs related to smart pointers while reducing the number of false positives. We should be able to cover at least one class fully, e.g. std::unique_ptr, and then extend it to std::shared_ptr or std::optional if time permits.

Summary

Within the GSoC time period, we could not implement the modeling for all smart pointer methods and operators as planned, since the problem was more complicated than we all anticipated. So far the majority of the modeling for std::unique_ptr is implemented and committed, and we found some promising results with it: 8 true positive warnings related to smart pointer null dereference in the LLVM project and 5 warnings in the WebKit project.

Even though we were only able to support std::unique_ptr so far, we managed to build a base modeling for the smart pointer checkers. This will act as a consolidated foundation for developing checkers for any C++ objects that are passed by value. It is one of the first conscious attempts to do so, and we’ve gained a lot of experience and managed to maintain our integrity, in the sense that the code ended up mostly free of hacks. It could most likely be generalized to modeling the entire C++ standard library. This does not give us the high-level architecture that’ll be needed to deal with the scale of the standard library, but we got our low-level basics right. This work will also be used to add support for checking other smart pointers (std::shared_ptr, std::weak_ptr) as well as std::optional, and it can be used to build a checker for use-after-free errors.

Research

Smart Pointer

A smart pointer is an abstract data type that simulates a pointer with additional automatic memory management. It manages a dynamically allocated object and ensures that the object is properly cleaned up. Such features are intended to reduce bugs caused by the misuse of pointers while retaining efficiency. Smart pointers typically keep track of the memory they point to, and may also be used to manage other resources, such as network connections and file handles.

Unique Pointer

A unique_ptr is a smart pointer that owns and manages another object through a pointer and disposes of that object when the unique_ptr goes out of scope. It should be used to own and manage any dynamically allocated object when its ownership is not shared. A unique_ptr explicitly prevents copying of its contained pointer (as would happen with normal assignment), but the move assignment operator and move constructor can be used to transfer ownership of the contained pointer to another unique_ptr.

Example 1: Dereferencing a default-constructed unique pointer, which is null

int foo(bool flag) {
  std::unique_ptr<int> x; // note: Default constructed unique pointer is null
  if (flag)               // note: Assuming 'flag' is false;
    return 0;             // note: Taking false branch

  return *x;              // warning: Dereferenced smart pointer 'x' is null.
}

Example 2: Dereferencing a unique pointer after calling release()

int bar() {
  std::unique_ptr<int> x(new int(42)); // valid unique pointer
  x.release(); // note: smart pointer 'x' becomes null and the ownership of
               // the memory is not transferred. This causes a memory leak,
               // which should be warned about.
  return *x;   // warning: Dereferenced smart pointer 'x' is null.
}

Similar to the cases above, other possible cases are dereferencing after calling std::move(), reset() or reset(nullptr), getting and explicitly deleting the inner pointer, or swapping with a null pointer using std::swap (more examples in the Appendix). All of these cases dereference a null or dangling pointer, which typically results in a crash.
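The null states behind these cases can be observed directly, without dereferencing anything. The sketch below is only an illustration using standard std::unique_ptr behavior; the helper names (nullAfterRelease and so on) are made up for this example and are not part of the checker.

```cpp
#include <cassert>
#include <memory>
#include <utility>

// Each helper returns true when the operation leaves the pointer null.

bool nullAfterRelease() {
    std::unique_ptr<int> p(new int(42));
    int* raw = p.release();  // p gives up ownership and becomes null
    delete raw;              // the caller is now responsible for the memory
    return p == nullptr;
}

bool nullAfterReset() {
    std::unique_ptr<int> p(new int(42));
    p.reset();               // deletes the int and leaves p null
    return !p;
}

bool nullAfterMove() {
    std::unique_ptr<int> src(new int(42));
    std::unique_ptr<int> dst(std::move(src));  // ownership moves to dst
    return src == nullptr && dst != nullptr && *dst == 42;
}

bool nullAfterSwap() {
    std::unique_ptr<int> p(new int(42));
    std::unique_ptr<int> empty;
    std::swap(p, empty);     // p now holds the null pointer
    return p == nullptr && empty != nullptr;
}

// Runs all four checks; each one is expected to hold.
void verifyNullStates() {
    assert(nullAfterRelease());
    assert(nullAfterReset());
    assert(nullAfterMove());
    assert(nullAfterSwap());
}
```

Dereferencing the unique_ptr at any point where these helpers report null is exactly the kind of bug the checker is meant to flag.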

Design

The basic idea of the checker is to keep a record of the raw pointers wrapped inside smart pointers, using a map between each smart pointer and its corresponding inner pointer. The map is updated by enumerating all situations in which the smart pointer becomes null, as well as the situations in which it becomes non-null. For example, when a smart pointer is default constructed, it is tracked as having a null inner pointer. The checker then checks whether any tracked pointer that is dereferenced has a null value. To make the bug report clearer, additional details are attached along the bug path to show where the smart pointer becomes null.
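The tracking idea can be sketched with a toy model. This is not the analyzer's code: the real implementation keeps the map in the program state, keyed by MemRegion with SVal values, while this sketch assumes plain variable names as keys and a two-valued enum. ToySmartPtrModel and all of its method names are hypothetical.

```cpp
#include <cassert>
#include <map>
#include <string>

// What the toy model knows about a tracked smart pointer's inner pointer.
enum class InnerPtr { Null, NonNull };

// Toy stand-in for the modeling: variable name -> inner pointer state.
// A variable that is not in the map is simply unknown.
class ToySmartPtrModel {
public:
    void onDefaultConstruct(const std::string& var) { state_[var] = InnerPtr::Null; }
    void onConstructWithValue(const std::string& var) { state_[var] = InnerPtr::NonNull; }
    void onRelease(const std::string& var) { state_[var] = InnerPtr::Null; }
    void onResetToNull(const std::string& var) { state_[var] = InnerPtr::Null; }

    // The smart pointer escaped (e.g. passed by non-const reference):
    // stop tracking it rather than risk a false positive.
    void onEscape(const std::string& var) { state_.erase(var); }

    // Report only when the pointer is tracked and known to be null.
    bool isNullDereference(const std::string& var) const {
        auto it = state_.find(var);
        return it != state_.end() && it->second == InnerPtr::Null;
    }

private:
    std::map<std::string, InnerPtr> state_;
};

// Usage: mirrors "default construct, then dereference" and "release, then dereference".
void demoToyModel() {
    ToySmartPtrModel model;
    model.onDefaultConstruct("x");
    assert(model.isNullDereference("x"));   // would warn
    model.onConstructWithValue("y");
    assert(!model.isNullDereference("y"));  // no warning
    model.onRelease("y");
    assert(model.isNullDereference("y"));   // would warn
}
```

Treating an untracked pointer as unknown, rather than null, is the design choice that keeps false positives down.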

Alternative Solution Considered

Another possible solution considered was manipulating symbolic values inside the smart pointer. The limitation that we run into here is that our memory model (“RegionStore”) doesn’t currently allow setting a “default” binding to a whole object when it’s a part (say, a field) of a bigger object. This means that we have to understand how the smart pointer works internally (which field corresponds to what) to manipulate its symbolic value, which ties us to a specific implementation of the C++ standard library. This might still work for a unique pointer, which probably always has exactly one field, but a shared pointer has multiple fields, so the approach depends on the particular implementation of the C++ standard library. We decided not to go with this approach since it is challenging and potentially a lot more work than the first approach.

Implementation

Initial Smart Pointer Modeling and Checker

D81315: Created a basic implementation.

  • Made a separate checker class for emitting diagnostics. The new checker uses checkPreCall and holds the bug reporting logic.
  • Kept all smart pointer related modeling logic in SmartPtrModeling. Shared common functionality via a header file shared between the SmartPtrModeling and SmartPtrChecker.
  • Made SmartPtrModeling a dependency of SmartPtrChecker.
  • Introduced a GDM with MemRegion as key and SVal as value to track the smart pointer and corresponding inner pointer.
  • Also added support to model unique_ptr constructor, release and reset methods.
  • Used evalCall to handle modeling. As part of this, enabled constructor support in the evalCall event with D82256.
  • Implemented checkDeadSymbols to clean up the MemRegion of smart pointers from the program state map when they go out of scope. This keeps the data structures in the program state as small as possible, so that they do not grow large while analyzing real code and eventually slow down the analysis.

With this patch, the model can emit warnings for cases like use after default construction, use after release, or use after reset with a null pointer. The SmartPtrChecker is kept under the alpha.cplusplus package, and smart pointer modeling has to be enabled by the ModelSmartPtrDereference flag.

checkRegionChanges for SmartPtrModeling

D83836: Implemented checkRegionChanges for SmartPtrModeling. To improve accuracy, the tracked region data is removed when a smart pointer is passed by non-const reference into a function, since we cannot be sure what happens to the smart pointer inside the function.

int foo() {
  std::unique_ptr<int> P; // note: Default constructed unique pointer is null
  bar(&P);                // note: Passing by reference
  return *P;              // No warnings
}

For example here in the code above, we are passing a default constructed unique_ptr ‘P’ to method bar(). But it is unknown whether the unique_ptr ‘P’ is reset with a valid inner pointer or not inside bar(). To avoid false positives we are not producing any warning on dereference of unique_ptr ‘P’ after bar().

Modeling for unique_ptr::swap method

D83877: Enabled the SmartPtrModeling to handle the swap method for unique_ptr. The swap() method can be used to exchange ownership of inner pointers between the unique_ptrs. So it is possible to make a unique_ptr null by swapping with another null unique_ptr. With this patch warnings are emitted when a unique_ptr is used after swapping with another unique_ptr with null as an inner pointer.

NoteTag for better reporting

D84600: Used NoteTags to add more detailed information along the bug path on where the smart pointer becomes null. Introduced a getNullDereferenceBugType() inter-checker API to check if the bug type is interesting.
[Figure: bug report after adding NoteTags]

Modeling for unique_ptr::get()

D86029: Modeled the get() method to return the tracked inner pointer. The get() method is used to access the inner pointer. When the inner pointer is used in conditional branching or other symbol-constraining operations, we can use the constraints on the inner pointer to find out whether the corresponding unique_ptr is null or not. When the inner pointer value for a unique_ptr is available in the tracked map, we bind that value to the return value of get(). Also made changes to conjure a symbol (conjureSymbolVal) when the inner pointer value is missing for a unique_ptr region we are tracking.
[Figure: example of the get() modeling]

Modeling for unique_ptr bool conversion

D86027: Modeled the case where a unique_ptr is explicitly converted to bool. It is common practice to check whether a unique_ptr is null before accessing it. If the inner pointer value is already tracked and known, we can figure out the corresponding boolean value, and the analyzer takes the branch based on that. When no symbol is tracked yet, we use SValBuilder::conjureSymbolVal to create one and constrain on that symbol to split the exploded graph, assuming null and non-null values.

Adding support for checkLiveSymbols

D86027: Implemented checkLiveSymbols to make sure that we keep the symbol alive as long as the corresponding owner unique_ptr is alive, to avoid removing the constraints related to that symbol. Also, when the unique_ptr goes out of scope, we make sure the symbols are cleaned up.

void foo() {
  int *RP = new int(12);
  std::unique_ptr<int> P(RP);
  if (P) { // Takes true branch
  }
}

For example here in the code above, we have to keep the symbol for RP alive since that is tracked as the inner pointer value of unique_ptr P. We have to use the constraints on that symbol to decide whether the branching takes the true or false branch.

Modeling of move assignment operator (unique_ptr::operator=)

D86293: Modeled how a unique_ptr moves the ownership of its managed memory to another unique_ptr via the = operator. With the move assignment operator, a unique_ptr can be reset from another unique_ptr, while the source unique_ptr loses ownership of its inner pointer and becomes null. It is also possible to assign nullptr to a unique_ptr and reset it to null. Made changes to update the tracked values of both the LHS and RHS unique_ptr operands of the operator.

Modeling for unique_ptr move constructor

D86373: Similar to the = operator, modeled how the unique_ptr moves the ownership of its managed memory to another via move constructor. Then tracked the moved unique_ptr’s inner pointer value as null.

Evaluation

The checker has been evaluated on a number of open-source software projects which use smart pointers extensively (symengine, oatpp, zstd, simbody, duckdb, drogon, fmt, re2, cppcheck, faiss). Unfortunately (or fortunately), the checker did not produce any true positive warnings on these projects. But we found 8 true positive warnings related to smart pointer null dereference in the LLVM project and 5 warnings in the WebKit project.

(Attaching a few example warnings)
Warnings in LLVM
[Figure: Warning-1, clang/lib/Analysis/Consumed.cpp]
[Figure: Warning-2, clang/lib/Lex/Preprocessor.cpp]
[Figure: Warning-3, llvm/utils/TableGen/OptParserEmitter.cpp]
Warnings in WebKit
[Figure: Warning-1]
[Figure: Warning-2]
[Figure: Warning-3]
[Figure: Warning-4]

Future Work

Model remaining methods of unique_ptr

So far we have covered the important methods of unique_ptr, but there are still a few more methods and operators to cover. The remaining methods are std::make_unique, std::make_unique_for_overwrite, and std::swap. The remaining operators include operator*, operator->, and all the comparison operators.

Model other smart pointers

Extend the modeling to std::shared_ptr, std::weak_ptr and std::optional. Right now SmartPtrModeling only models most of std::unique_ptr; adding modeling for the other smart pointers will make the checker complete.

CallDescriptionMap support for CXX Constructor and Operator

To enable the checker by default we have to use CallDescriptionMap for evalCall. Right now we are manually implementing name matching logic that is already implemented in CallDescriptionMap. However, CallDescriptionMap does not support constructor and operator calls yet; the changes are in review (D81059, D80503) by Charusso.

Enabling the checker by default

The SmartPtrChecker is under the alpha.cplusplus package and smart pointer modeling has to be enabled by the ModelSmartPtrDereference flag. Enabling the checker by default will benefit codebases that use smart pointers.

Inlined defensive checks

We are using trackExpressionValue() to track how an inner pointer expression for a unique_ptr is getting null in the bug report. The trackExpressionValue() is suppressing some warnings to avoid the false positives with inlined defensive checks.

For example, the code below produces a true positive warning.

int foo(std::unique_ptr<int> P) {
  if (P) {}  // Assuming P is null.
  return *P; // warning: null dereference
}

On the other hand, the code below produces a false positive warning. We cannot infer that unique_ptr Q is null in bar() based on the check if (P) inside the call to foo(). So the warning should be suppressed.

void foo(std::unique_ptr<int> P) {
  if (P) {} // Assuming P is null
}

int bar(std::unique_ptr<int> Q) {
  foo(Q);
  return *Q; // warning: null dereference
}

Right now we are trusting trackExpressionValue() when it suppresses reports. It may occasionally suppress true positive warnings, but that is better than having false positives.
The code below is an example of a suppressed true positive warning.

int *return_null() { return nullptr; }

int foo() {
  std::unique_ptr<int> x(return_null());
  return *x; // no-warning
}

We have to investigate more real-world projects and see whether trackExpressionValue() is sufficient for suppressing all false positive warnings related to inlined defensive checks.

Marking regions as not interesting

Right now there is no API similar to markInteresting() for marking a region as not interesting in a bug report. With such support, our checker could remove less useful and unwanted notes from the report. For example, when a unique_ptr is dereferenced after release(), we don’t have to show a note tag on the unique_ptr constructor unless it is constructed with null.

[Figure: report before]
[Figure: report after marking P not interesting]

Communication with MallocChecker

When raw pointers are accessed from unique_ptr via get() or release(), we have to ensure that the raw pointers are tracked via MallocChecker. Also, enable SmartPtrModeling to communicate the deallocation to MallocChecker when we see the destructor call of the unique_ptr and it has a default deleter. Also, communicating with MallocChecker could potentially find double-free errors when the same pointer is passed to multiple unique_ptrs or it is also freed independently of the unique_ptr (example).

Add modeling for user-defined custom smart pointers

Many C++ projects have their own custom implementations of smart pointers similar to boost::shared_ptr or llvm::IntrusiveRefCntPtr. If the user can specify the custom smart pointers and methods on it, we could reuse the existing SmartPtrModeling for modeling and checking the custom smart pointers.

How to use

All the changes are in the master. But the checker and modeling are not enabled by default. Checker is under the alpha.cplusplus package and smart pointer modeling has to be enabled by the ModelSmartPtrDereference flag.

Checker and modeling can be enabled explicitly:

$ scan-build -enable-checker alpha.cplusplus.SmartPtr -analyzer-config cplusplus.SmartPtrModeling:ModelSmartPtrDereference=true clang -c test.cpp

(Since SmartPtrChecker depends on SmartPtrModeling, we don’t have to explicitly enable SmartPtrModeling.)

Acknowledgment

I want to express my gratitude towards everyone that helped me with this project, but especially to 3 individuals: My mentors, Artem Dergachev, Gábor Horváth, and Valeriy Savchenko. With their guidance, I’ve learned a lot about how Clang Static Analyzer works during the summer. I got to skype with them every Monday and received all the help and suggestions. When I got stuck with issues I got immediate help even on the weekends. Also, I received very fast feedback for my review requests. I am also thankful to Kristóf Umann for tips and comments on the reviews.

Thank you very much for the support and mentoring.

Appendix

Potential bugs with unique_ptr

A default constructed unique pointer has null value

int foo() {
  std::unique_ptr<int> x; // note: Default constructor produces a null unique pointer
  return *x;              // warning: Dereferenced smart pointer 'x' is null.
}

Unique pointer constructed with null value

int foo() {
  std::unique_ptr<int> x(nullptr); // note: Constructing with nullptr produces a null unique pointer
  return *x;                       // warning: Dereferenced smart pointer 'x' is null.
}

Unique pointer constructed with move constructor

int foo() {
  std::unique_ptr<int> y(new int(13));
  std::unique_ptr<int> x(std::move(y)); // note: unique_ptr y is moved to x
  return *y;                            // warning: Dereferenced smart pointer 'y' is null.
}

release

int foo() {
  std::unique_ptr<int> x(new int(13));
  x.release(); // note: unique_ptr x is null after release
  return *x;   // warning: Dereferenced smart pointer 'x' is null.
}

reset

int foo() {
  std::unique_ptr<int> x(new int(13));
  x.reset(nullptr); // note: unique_ptr x is null after reset to null
  return *x;        // warning: Dereferenced smart pointer 'x' is null.
}

swap

int foo() {
  std::unique_ptr<int> x(new int(13));
  std::unique_ptr<int> y;
  x.swap(y); // note: x is null after swapping with null y
  return *x; // warning: Dereferenced smart pointer 'x' is null.
}

get

int foo() {
  std::unique_ptr<int> x;
  if (!x.get())
    return *x; // warning: Dereferenced smart pointer 'x' is null.
  return 0;
}

operator bool

int foo() {
  std::unique_ptr<int> x;
  if (!x)
    return *x; // warning: Dereferenced smart pointer 'x' is null.
  return 0;
}

Double-free error example

void foo() {
  int *i = new int(42);
  std::shared_ptr<int> p1(i);
  std::shared_ptr<int> p2(i); // When p2 goes out of scope it will try
                              // to delete the inner pointer which is already deleted by p1.
}

Building LLVM From source

I am documenting how I build LLVM Clang on my 2015 MacBook Air.

  • macOS Mojave 10.14.6
  • 1.6 GHz Intel Core i5
  • 8 GB Memory

Get the source code

git clone https://github.com/llvm/llvm-project.git

Build the code (I am using ninja to build)

mkdir $ROOT/llvm-project/build
cd $ROOT/llvm-project/build
cmake -G Ninja \
  -DDEFAULT_SYSROOT="$(xcrun --show-sdk-path)" \
  -DLLVM_ENABLE_PROJECTS="clang;libcxx;libcxxabi" \
  -DCMAKE_BUILD_TYPE=Release ../llvm
ninja clang
ninja cxx

Apparently we also need to build libc++ by running ninja cxx; otherwise we get some header errors.

Running LLVM-Clang Static Analyzer

To run the static analyzer on a test.cpp file, we can use the scan-build utility with the command below.

$ROOT/llvm-project/clang/tools/scan-build/bin/scan-build -k -V \
  --use-analyzer $ROOT/llvm-project/build/bin/clang -o . clang -c ./test.cpp

Options

  • -V opens the report in the browser when the build completes.
  • -k adds a "keep on going" option to the build command.
  • -c (passed to clang) runs only the preprocess, compile, and assemble steps.

Or we can run a specific check directly via clang. Here, for example, NullDereference:

$ROOT/llvm-project/build/bin/clang++ -cc1 -analyze -analyzer-checker=core.NullDereference test.cpp

Summary

A full build took around 2.5 hours on my machine.
Rebuilding after a small change took around 1 minute.


I am happy to announce that my proposal for GSoC 2020 - Find null smart pointer dereferences with the LLVM-Clang Static Analyzer got accepted. This summer I will be working on adding the feature to the static analyser to find null smart pointer dereferences.

I am very excited and terrified. Hopefully, with help from my mentors Artem Dergachev, Gábor Horváth, and Valeriy Savchenko, I will be able to finish the project.

I will try to blog my progress and anything exciting I find in my journey.


More Details: Find null smart pointer dereferences with the Static Analyzer

VIM Notes

I am writing down all my Vim tips here. This post will be updated regularly.

Modes

Vim has different modes for manipulating text. Normal, Insert, and Visual are the main modes.

Normal Mode

The default, or natural resting, state. Commands that manipulate the text are run here.

Insert Mode

The mode for adding, editing, and correcting text; this is where you can type.

  • Ctrl + h Delete back one char.
  • Ctrl + w Delete a word.
  • Ctrl + u Delete to beginning of the line.
  • Ctrl + [ Change to normal mode.
  • Ctrl + o Insert Normal mode.
  • Ctrl + r, {register} Paste the content of the specified register while in Insert mode.

Insert Normal Mode

Lets us execute a single Normal mode command from Insert mode.
For example, Ctrl + o, zz will move the current line to the center of the screen.

Visual Mode

Makes it easy to manipulate text at the character level, the line level, and in rectangular blocks. Many operations and commands work similarly to Normal mode.
v will change to character-wise Visual mode.
V will change to line-wise Visual mode.
Ctrl + v will change to block-wise Visual mode. On Windows, Ctrl + q does this instead.
o will toggle the free end of the selection in Visual mode.

Basic Actions

  • hjkl - for moving around
  • w - jump a word.
  • $ - jump to the end of the line.
  • ^ - beginning of the line.

Advanced Movements

  • `. - jump to the position of the last change.
  • ma - Mark the current cursor location as “a”.
  • `a - Jump to the mark named “a”.
  • :marks - List all marks.

Append

  • a is used for appending.
  • A for appending to the end of the line. $a is another way to achieve this.

Insert

  • i is used for inserting.
  • I is used for inserting in the first of line. ^i is equivalent to this.

Dot to repeat

  • . repeats the last executed action. When repeating, Vim treats all changes made between entering and leaving Insert mode as a single action.
  • It will repeat every keystroke typed in Insert mode.

Undo

  • u for undoing. From the moment we enter Insert mode until we return to Normal mode, everything we type (or delete) counts as a single change.

Delete

  • d is the delete operator; it is combined with a motion (for example, dl deletes a character).
  • dd will delete the whole line.
  • dw will delete a word.
  • daw will delete a word including the space around it.
  • diw will delete a word without the surrounding space.

Finding a char using f and t

  • fx will find the next char x in the line (tx stops just before it); to jump to the next match use ;
  • , will repeat the last character search in the opposite direction.

Indentation

  • > is used for indentation.
  • < left shift
  • = Auto indentation.
  • >G will increase the indentation from the current line until the end of the file.

Yank

  • copy from current line to the n’th line
    1. y20G this will yank from current line to 20th line.
    2. :.,20y same with range, :[range]y[ank] [x].
    3. "[register]y will copy to the register specified, e.g. "*y will copy to the system clipboard (register *).

Simple increment and Decrement

  • Ctrl + a will increment the number under the cursor.
  • Ctrl + x will decrement the number under the cursor.

Search and Replace

:[range]s[ubstitute]/{pattern}/{string}/[flags] [count] format for the search and replace command.

Flags

  • c confirm on each substitution.
  • g replace all occurrences.
  • i ignore case for pattern.

:%s/old/new/g : will replace “old” with “new” in the whole document.

AutoComplete in vim

  • Auto word completion
  • Auto line completion
  • Auto file completion

Registers

Registers are essentially named memory spaces in Vim for saving and reusing text. Registers are accessed with ".

  • "ry - will yank the selected text to the register named “r”.
  • "rp - will paste the content of register “r” in Normal mode.
  • Ctrl + r, r - will paste the content of register “r” in Insert mode.

Common registers

  1. * - System clipboard.

Macro

Sessions

Sessions are used to save the current state of Vim and restore it when you need it.

  • :mks is used to create a session for your vim editor.
    eg: :mks ~\vimsessions\bar.vim
  • :source is used to restore the session which you saved.
    eg: :source ~\vimsessions\bar.vim

VIM plugins

Vim-OrgMode

Plain List:

  • <localleader> cl or <CR> - insert plainlist item below
  • <localleader> cL or <C-S-CR> - insert plainlist item above

Checkboxes:

  • <localleader> cc - toggle status
  • <localleader> cn or <CR> - insert checkbox below
  • <localleader> cN or <C-S-CR> - insert checkbox above

Dates:

  • <localleader> sa - insert date
  • <localleader> si - insert inactive date

<localleader> is \ by default.

Split Screen

with the <C-w> key

  • <Ctrl-w>n - :new horizontal split (editing a new empty buffer)
  • <Ctrl-w>s - :split window horizontally (editing current buffer)
  • <Ctrl-w>v - :vsplit window vertically (editing current buffer)
  • <Ctrl-w>c - :close window
  • <Ctrl-w>o - close all windows, leaving :only the current window open
  • <Ctrl-w>w - go to next window
  • <Ctrl-w>p - go to previous window
  • <Ctrl-w><Up> - go to window above
  • <Ctrl-w><Down> - go to window below
  • <Ctrl-w><Left> - go to window on left
  • <Ctrl-w><Right> - go to window on right
  • <C-w> <C-r> - To swap the two parts of a split window

Window size commands

  • Ctrl+W +/- - increase/decrease height (ex. 20+)
  • Ctrl+W >/< - increase/decrease width (ex. 30<)
  • Ctrl+W _ - set height (ex. 50_)
  • Ctrl+W | - set width (ex. 50|)
  • Ctrl+W = - equalize width and height of all windows
    Without a count, +/- and >/< resize by one line or column.

Buffer

  • :new will create a split window with an unnamed buffer.
  • :badd filename will add the file to the bufferlist.
  • :enew will open one in the current window.
  • :vnew will open one in a vertically split window.
  • :tabnew will open one in a new tab.
  • :bn will change to next buffer.
  • :bp will change to previous buffer.
  • :br will rewind to the first buffer in the buffer list.
  • :bf will change to first buffer.
  • :ls will list all the buffers.
  • :bd will delete the current buffer; a buffer id can also be specified.

Random hacks

  • :r !date /t will insert the current date (on Windows).
  1. http://www.rayninfo.co.uk/vimtips.html

Introduction

Recently I gave a talk at my office about generics in .NET: how they were introduced and how they work behind the scenes. I mainly referred to Design and Implementation of Generics for the .NET Common Language Runtime by Andrew Kennedy and Don Syme.

What is generics?

Generics is a methodology for writing programs or logic without specializing to any type. The program logic is generic: it accepts a type as a parameter and is specialized/instantiated for that type. It is also known as parametric polymorphism. It is commonly used to avoid code duplication and to keep type-independent logic in a single place.

Generics in .NET

Initial Design Goals

  • Safety : Bugs are caught at compile time.
  • Expressivity : Different specialization using type parameter.
  • Clarity : Less casting between types.
  • Efficiency : Reduced or no need for run-time checks.

Before generics

Before generics were introduced in C#, object was used, as it is the topmost type in the class hierarchy. But this was not type safe, and there was the overhead of boxing and unboxing for primitive element types.
[Figure: boxing and unboxing IL instructions]
[Figure: Object Stack vs Generic Stack]

IL code

BenchMarking

Reference

Summary:

I updated my progress on both F# for fun and profit and HackerRank, and we discussed one of my HackerRank solutions. Oleg suggested trying Fibonacci in different ways. We tried solving a problem from HackerRank: Compute the Perimeter of a Polygon.

My initial code

//Enter your code here. Read input from STDIN. Print output to STDOUT
let distance (x1:int,y1:int) (x2:int,y2:int) =
    // Parenthesized: "abs x1-x2" would parse as "(abs x1) - x2"
    let xDiff = abs (x1-x2)
    let yDiff = abs (y1-y2)
    let sqrSum = (pown xDiff 2)+( pown yDiff 2)
    sqrt (double sqrSum)

let getPoint (s:string) =
    let va =
        s.Split(' ')
        |> Array.map System.Int32.Parse
    (va.[0], va.[1])

[<EntryPoint>]
let main argv =
    let t = System.Console.ReadLine()|> int
    let values =
        Seq.initInfinite(fun _ -> System.Console.ReadLine())
        |> Seq.takeWhile(isNull >> not)
        |> Seq.map getPoint
        |> Seq.toList

    let first, rest = values.[0], List.tail values

    let foldFunc (perimeter, prevPoint) nxtPoint =
        perimeter+(distance prevPoint nxtPoint), nxtPoint

    let (finalPerimeter , last) = List.fold foldFunc (0.0, first) rest

    printfn "%f" (finalPerimeter + (distance first last))
    0

Cleaned up code

// Enter your code here. Read input from STDIN. Print output to STDOUT
let distance ((x1:int, y1:int), (x2:int, y2:int)) =
    // Parenthesised so abs applies to the difference, not just the first operand
    let xDiff = abs (x1 - x2)
    let yDiff = abs (y1 - y2)
    let sqrSum = (pown xDiff 2) + (pown yDiff 2)
    sqrt (double sqrSum)

let getPoint (s:string) =
    let [| x; y |] =
        s.Split(' ')
        |> Array.map System.Int32.Parse
    x, y

[<EntryPoint>]
let main argv =
    let testCases = System.Console.ReadLine() |> int
    let values =
        Seq.init testCases (fun _ -> System.Console.ReadLine())
        |> Seq.map getPoint
        |> Seq.toList

    // Prepend the first point to the reversed list so the polygon is closed
    let lines = Seq.pairwise (values.[0] :: (List.rev values))

    let perimeter =
        lines
        |> Seq.map distance
        |> Seq.sum

    printfn "%f" perimeter
    0

New concepts I learned

  • Seq.pairwise
    Seq.pairwise takes a sequence and returns a sequence of tuples, pairing each element of the input sequence (except the first) with its predecessor, as (predecessor, element).
    eg: Seq.pairwise [1..4] returns seq [(1, 2); (2, 3); (3, 4)]

  • Seq.init
    Generates a new sequence which, when iterated, returns successive elements by calling the given function with the index, up to the given count. Signature: Seq.init count initializer
    eg: Seq.init 4 (fun n -> n * 2) returns seq [0; 2; 4; 6]

  • yield VS yield!(yield bang)

let simpleYield = seq { for i in 1..5 do yield i }
// returns seq [1; 2; 3; 4; 5]

// yield! is placed after the loop, so simpleYield is appended once at the end
let simpleYieldBang =
    seq {
        for i in 6..10 do
            yield i
        yield! simpleYield
    }
// returns seq [6; 7; 8; 9; 10; 1; 2; 3; 4; 5]

yield! takes a whole sequence and yields each of its elements in turn, whereas yield would emit the sequence itself as a single element.
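The three concepts can be combined in one small sketch that recomputes a polygon perimeter without reading from STDIN. The corners list is made-up sample data (a 3 x 4 rectangle), and this is only one possible arrangement, not the solution we submitted.

```fsharp
// Euclidean distance between the two ends of an edge.
let distance ((x1: int, y1: int), (x2: int, y2: int)) =
    let dx = float (x1 - x2)
    let dy = float (y1 - y2)
    sqrt (dx * dx + dy * dy)

// Hypothetical sample input: the corners of a 3 x 4 rectangle, in order.
let corners = [ (0, 0); (3, 0); (3, 4); (0, 4) ]

// Seq.init calls the function with indices 0 .. count - 1.
let points = Seq.init (List.length corners) (fun i -> corners.[i])

// yield! splices in every point; the extra yield repeats the first point
// so that the last edge closes the polygon.
let closedPath =
    seq {
        yield! points
        yield Seq.head points
    }

// Seq.pairwise turns the path into edges: (p0, p1); (p1, p2); ...
let perimeter =
    closedPath
    |> Seq.pairwise
    |> Seq.map distance
    |> Seq.sum

printfn "%f" perimeter
```

For this rectangle the edges have lengths 3, 4, 3 and 4, so the perimeter comes out as 14.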

Here are the few questions we discussed

  1. Is None the same as null in C#?
    Roughly, yes. When F# code is consumed from C#, None appears as null. It is used to represent a value that may not exist or is invalid.

  2. Are there any types for None or Some?
    Yes. It is called the Option type, a union type with two cases, None and Some. eg: int option is an option type that wraps an int value. It is used with pattern matching to handle cases where a valid value does not exist. More Details

  3. Is List.fold implemented as a recursive call or as a for loop?
    fold analyzes a recursive data structure and recombines the results of recursively processing its constituent parts, building up a return value.

  4. Is Seq.unfold similar to a lazy list in C#? Does it store any state internally?
    It keeps an internal iterator structure, and elements are calculated on the fly. Elements in the stream are generated on demand by applying the element generator, until the element generator returns None. Each call to the element generator returns a new residual state.

  5. In partial application, are parameters always passed from left to right?
    Yes. Parameters are always applied in order, from left to right.

  6. While finding the type by type inference, do we actually handle the run-time case?
    For example:
    let x = 2147483647 + 1 No error
    let y = 2147483648 Shows error FS1147: This number is outside the allowable range for 32-bit signed integers.
    The compiler only checks that each literal value is within the set of values supported by its type. It does not evaluate the operation to verify that the result still fits the type.
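A few of these answers are easier to see in code. The sketch below is my own illustration rather than something from the call: safeDivide, describe and divide100By are hypothetical names, and the Fibonacci generator is just one way of using Seq.unfold.

```fsharp
// Option: return None instead of throwing when there is no valid result.
let safeDivide (a: int) (b: int) : int option =
    if b = 0 then None else Some (a / b)

// Pattern matching forces both cases to be handled.
let describe (result: int option) =
    match result with
    | Some v -> sprintf "value %d" v
    | None -> "no value"

// Seq.unfold: the state is the pair (current, next). Each step emits
// 'current' and hands back the new state; returning None would end the stream.
let fibs = Seq.unfold (fun (a, b) -> Some (a, (b, a + b))) (0, 1)
let firstFibs = fibs |> Seq.take 8 |> Seq.toList

// Partial application: arguments are supplied left to right, so fixing the
// first argument of safeDivide leaves a function waiting for the divisor.
let divide100By = safeDivide 100

printfn "%s / %s" (describe (divide100By 5)) (describe (divide100By 0))
printfn "%A" firstFibs
```

Nothing in fibs is computed until Seq.take asks for elements, which is why an infinite generator is safe here.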

F# Software Foundation’s Mentorship Program

It is my pleasure to share that I have been selected for the F# Software Foundation’s Mentorship Program. I got a great mentor, Oleg Golovin. We decided to meet for one hour every weekend over Skype. The first 30 minutes will go to clearing my doubts about F#, and the next 30 minutes will be used for solving challenges on HackerRank via pair programming. I am really excited about this and hope to make full use of the opportunity.
The first weekly meetup was held on 10-Sep-2017 as a Skype call. We started off with a general introduction to F#. I described how I am learning F# now and which materials I am following. Oleg suggested that I refer to the F# for fun and profit web site and practice more questions from the HackerRank Functional Programming track.
My setup: Mac OS X, VS Code, Ionide.

New concepts I learned

  • map
    map applies the given function to each element of a list and creates a new list from the results. map is supported for Array, List, Seq, etc.

let list1 = [1; 2; 3; 4; 5; 6; 7; 8; 9; 10]
let list2 =
    list1
    |> List.map (fun x -> x * 2)
// list2 = [2; 4; 6; 8; 10; 12; 14; 16; 18; 20]

  • fold
    A “fold” operation applies the given function to each element in a list, passing along an accumulator that starts from an initial value. It returns the final accumulator as the result of the fold. fold is supported for Array, List, Seq, Set and Map.

let list1 = [1; 2; 3; 4; 5; 6; 7; 8; 9; 10]
let sum =
    list1
    |> List.fold (fun sum elmnt -> sum + elmnt) 0 // 0 is the initial accumulator
// The folder takes the accumulator first and the current element second.
// Its return value is passed as the accumulator (sum) in the next iteration.
// sum = 55

  • How calling function is different
let foo() =
    printfn "Hello world"
foo   // this won't call foo; it evaluates to the function value itself
foo() // this calls foo, since we are passing the unit value `()`

In F# every function accepts a single parameter, so we have to pass the unit value () even if the function needs no input.

  • unit type
    The unit type means the absence of any specific value. It is just a placeholder to use when no value is available or required. Its only value is (). For example, functions like printf return nothing but unit. TLDR: F# unit is roughly C# void.
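To make the accumulator in fold more visible, here is a small sketch of my own (not from the call) that uses List.scan, which runs the same kind of folder but keeps every intermediate accumulator.

```fsharp
let numbers = [1; 2; 3; 4; 5]

// The folder receives the accumulator first, then the current element.
let total = numbers |> List.fold (fun acc x -> acc + x) 0

// List.scan keeps each intermediate accumulator, starting with the
// initial value, so the steps of the fold can be inspected.
let runningTotals = numbers |> List.scan (fun acc x -> acc + x) 0

// With a folder that is not commutative, the argument order matters:
// here the accumulator is the text built so far.
let digits = numbers |> List.fold (fun acc x -> acc + string x) ""

printfn "%d %A %s" total runningTotals digits
```

The last element of runningTotals is the same value that List.fold returns.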

Here are the few questions we discussed.

  1. Type representation is a little confusing: a tuple is written (a, b), but the type highlighter shows it as a * b. Is there a good place to look for more details about this type representation?
    I think this is done in part to avoid confusion with multi-argument generics like ‘Dictionary<string, int>’. Imagine that the dictionary values are tuples of two integers ‘(int, int)’. So, ‘Dictionary<string, int, int>’? ‘Dictionary<string, (int, int)>’? To me, ‘Dictionary<string, int * int>’ is cleaner, because ‘*’ is easily recognized as a tuple mark, whereas for ‘()’ you have to look more carefully into the type definition.

  2. Why do we need to explicitly specify that a function is recursive using rec?
    There is an excellent answer here: Stack Overflow. Basically, that’s just a historical choice.

  3. Won’t immutability cause more memory use, since every change creates a new object/value?
    Yes, immutability would use a lot more memory… if you are careless with collections and object passing. Also, FSharp structures are heavily optimized. So, for example, when you add a new element to a list, it doesn’t create a new copy of the list with copies of all its elements. FSharp just creates one new list element, marks it as the head of the ‘new’ list, and attaches the old list as the tail. Arrays are not that way; they actually copy all of their contents.

  4. I read somewhere that using mutable variables is not the functional way. In F#, do we always try or prefer not to use mutable variables?
    It’s better when your function always returns the same result for the same inputs. But if we declare and use a mutable variable somewhere inside that function, the ‘same result’ guarantee is weaker. If we use a mutable variable that’s declared elsewhere, there’s no guarantee that some other function hasn’t changed it, so we wouldn’t get the ‘same result’. Of course, going strict ‘no mutables’ is not a good way either. I find it preferable not to use public mutables. When you absolutely need some state that changes over time on a long-living entity, it’s totally okay to use a private mutable.

  5. Why are arrays mutable while lists are not?
    Array in F# has the same base as in C#: System.Array. So, naturally, arrays behave the same as in C#. Lists, on the other hand, are the special immutable ‘FSharpList’. When you add an element, you actually create a new list as I described above. If you try to mutate an element in the list, the head that previously was pointing to that element is now pointing into corrupted memory, because the list is singly-linked. You change the element, and the tail gets disconnected from the former head.
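The point about lists sharing their tail while arrays are copied can be checked directly. This is a minimal sketch of my own with made-up values, using reference equality to show that no list elements are duplicated.

```fsharp
let oldList = [2; 3; 4]
let newList = 1 :: oldList     // one new cell; oldList is reused as the tail

// The tail of newList is the very same object as oldList, not a copy.
let tailIsShared = System.Object.ReferenceEquals(List.tail newList, oldList)

// Arrays are mutable, and copying one duplicates all of its contents.
let original = [| 1; 2; 3 |]
let copy = Array.copy original
copy.[0] <- 100                // changes the copy only

// A tuple value is written (1, "one"); its type is written int * string.
let pair : int * string = (1, "one")

printfn "%b %A %A %A" tailIsShared original copy pair
```

Because the old list can never change, sharing it between the two lists is safe, which is what keeps prepending cheap.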

Introduction to F# in Mac OS

In short we will be setting up in the below order.

  • Install .NET Core.
  • Install VS Code.
  • Install Ionide.

Installing .NET Core

Download and install .NET Core SDK from .NET Core for Mac.

Installing VS Code

Download Visual Studio Code for Mac and Install.

Installing Ionide

Ionide is a plugin to support F# language features for VS Code. Open VS Code, press Cmd+P and enter the command ext install Ionide-fsharp to install the Ionide package.
Or search ionide in VS Code extensions and install from there.

Hello World

  1. Create a solution to have multiple projects.

dotnet new sln --name Everything

If we do not specify --name, the folder name is used as the solution name.

  2. Create an F# console project and add it to the solution.

dotnet new console -lang "F#" -o hwFSharpApp

In the above command, -o hwFSharpApp sets an output directory of hwFSharpApp and creates hwFSharpApp.fsproj. console -lang "F#" creates a console app in the F# language.

dotnet sln add hwFSharpApp/hwFSharpApp.fsproj

This will add the project hwFSharpApp/hwFSharpApp.fsproj to the solution.

  3. Build and run.
The below command will build the solution with all the projects.

dotnet build Everything.sln

To run the console application, use dotnet run with --project, which specifies the project to run.

dotnet run --project hwFSharpApp/hwFSharpApp.fsproj

  4. Use VS Code to edit.
Using VS Code, open the folder containing the solution (Everything.sln) we created. We can use the F# Project Explorer to build, run and debug the F# projects by setting one as the startup project.
Use --help to explore more options in the .NET CLI.

More Details

  1. Use F# on Mac OSX
  2. Get started with F# and .NET Core

Introduction to Git

Git is a version control system created by Linus Torvalds, the creator of Linux. It was created for managing contributions to the Linux code base. This post covers some of the basic commands that are used regularly.

Git Terms

Branch : A branch starts a new line of development. It is independent of the main line and acts as a new workspace, so the developer can work without disturbing the main code.

Remote : A remote repository is hosted on the internet, and we keep a copy of our project there. It acts as the central repository for the project, and different local repositories push code to the remote.
$ git remote - to list all the remotes.

Upstream : This refers to the original repository that we forked from. When we fork a repository, the upstream is not set by default. We must configure an upstream remote in Git to sync changes made in the original repository.

Forking : Forking creates a server-side copy of the original server repository. The fork acts as the remote for the local repository and gives the contributor a public repository to share. The contributor then has both a private local repository and a public server-side one.

Git Commands

  • Forking the repository
    In the top-right corner of the project page, click Fork. This will create a forked project under your account.
  • Cloning the forked repository
    $ git clone https://github.com/user/awesome.git
  • Setting up upstream
    $ git remote add upstream https://github.com/original_author/awesome.git
    After setting up the upstream, we can sync with it to get changes made in the original repository.
  • Creating a new branch
    $ git checkout -b MyNewBranch - This will create a new branch and check it out.
  • Pushing the new branch to the remote
    $ git push origin MyNewBranch - This will push the commits on MyNewBranch to origin.
  • Deleting the branch after merging
    $ git branch -D MyNewBranch - This will delete the branch even if we did not merge or rebase it. Use -d instead to delete only a branch that is already merged.
  • Pruning stale remote-tracking branches
    $ git remote prune origin - This will remove local references to branches that no longer exist on origin.
  • Fetching changes from the upstream
    To get the latest changes made in the original repository, we can fetch them from upstream.
    $ git fetch upstream
  • Merging our master with upstream
    $ git merge upstream/master

More Reference
